DINUCLEOTIDE RESTRICTION ENDONUCLEASE PREPARATIONS AND METHODS OF
USE

FIELD OF THE INVENTION
The present invention relates generally to isolated purified 
polynucleotides which encode restriction enzymes and to methods of expressing 
the restriction enzymes from such polynucleotides. More particularly this 
invention relates to isolated purified polynucleotides which encode CviJ I and
related methods for the production of this enzyme. 

Other aspects of the invention relate to methods for partially or 
completely digesting DNA at a dinucleotide sequence. More particularly, this 
aspect of the invention relates to methods of generating quasi-random fragments 
of DNA, and methods of cloning, labeling, and sequencing DNA, as well as 
epitope mapping of proteins. The invention also relates to methods for generating 
sequence-specific oligonucleotides from DNA, without prior knowledge of the 
nucleic acid sequence of such DNA, and to methods for cloning and labeling 
DNA after restriction digestion by a two base recognition endonuclease reagent. 
This invention also relates to methods for cloning, labeling, and detecting nucleic 
acids using two base restriction endonuclease reagents, such as CviJ I, BsuR I, 
Aci I or CGase I. Further, the invention relates to labeling DNA by taking
advantage of certain properties of the holo-enzyme of thermostable DNA 
polymerases. 



BACKGROUND OF THE INVENTION
Restriction endonucleases are a group of enzymes originally found 
to be expressed in a wide variety of prokaryotic organisms. More recently they
have also been found to be encoded in viral genomes. These enzymes catalyze
the selective cleavage of DNA at generally short sequences, often unique to the
individual enzyme. This ability to cleave makes restriction endonucleases
indispensable tools in recombinant DNA technology. The increased commercial
availability of the isolated enzymes has contributed in large part to the enormous
expansion in the field of recombinant DNA technology over the last few years. 

These enzymes have been classified into three groups. Because of 
properties of the type I and type III enzymes, they have not been widely used in
molecular biology applications, and will not be discussed further. Type II
enzymes are part of a binary system known as a restriction modification system 
consisting of a restriction endonuclease that cleaves a specific sequence of 
nucleotides and a separate DNA modifying enzyme that modifies the same 
recognition sequence and thereby prevents cleavage by the cognate endonuclease. 
A total of about 2103 restriction enzymes are known, encompassing 179 different 
type II specificities (Roberts, et al., Nucl. Acids Res. 20:2167-2180 (1992)).
Although there are more than 1200 type II restriction enzymes, many of them are
members of groups which recognize the same sequence. Restriction enzymes that 
recognize the same sequence are said to be isoschizomers. 

The vast majority of type II restriction enzymes recognize specific
double-stranded sequences which are four, five, or six nucleotides in length and 
which display twofold (palindromic) symmetry. A few enzymes recognize longer 
sequences or degenerate sequences. 

The location of cleavage sites within a palindrome differs from 
enzyme to enzyme. Some enzymes cleave both strands exactly at the axis of 
symmetry generating fragments of DNA that carry blunt ends, while others cleave
each strand at similar sequences on opposite sides of the axis of symmetry,
creating fragments of DNA that carry protruding, single-stranded termini.

Restriction endonucleases with shorter recognition sequences cut 
DNA more frequently than those with longer recognition sequences. For
example, assuming a 50% G-C content, a restriction endonuclease with a 4-base
recognition sequence will cleave, on average, every 4^4 (256) bases compared to
every 4^6 (4096) bases for a restriction endonuclease with a 6-base recognition
sequence. Under certain conditions some restriction endonucleases are capable
of cleaving sequences which are similar but not identical to their defined
recognition sequence. This altered specificity has been termed "star" (*) activity
and is observed only under certain non-standard reaction conditions. The manner 
in which an enzyme's specificity is altered depends on the particular enzyme and 
on the conditions employed to induce the star activity. Conditions that contribute 
to star activity include high glycerol concentration, high ratio of enzyme to DNA, 
low ionic strength, high pH, the presence of organic solvents, and the substitution
of Mg++ with other divalent cations. The most common types of star activity
involve cutting at a recognition sequence having a single base substitution, cutting
at sites having truncation of the outer bases of the recognition sequence, and
single-strand nicking. The following restriction endonucleases show star activity:
Ase I, BamH I, BssH II, BsuR I, CviJ I, EcoR I, EcoR V, Hind III, Hinf I, Kpn
I, Pst I, Pvu II, Sal I, Sca I, Taq I, and Xmn I. Star activity is generally viewed
as undesirable, and of little intrinsic value.
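
The 256-base and 4096-base averages quoted above follow from treating each position as an independent draw from four equally likely bases. A minimal sketch of that arithmetic, assuming uniform base composition (the function name `expected_cut_spacing` is illustrative and not part of the original disclosure):

```python
def expected_cut_spacing(site_length: int, alphabet_size: int = 4) -> int:
    """Average number of bases between occurrences of one fixed recognition
    sequence, assuming independent positions and equally likely bases
    (the 50% G-C assumption used in the text)."""
    if site_length < 1:
        raise ValueError("site_length must be at least 1")
    return alphabet_size ** site_length


if __name__ == "__main__":
    # Compare a 2-base, a 4-base and a 6-base recognition sequence.
    for n in (2, 4, 6):
        print(f"{n}-base site: one cut per {expected_cut_spacing(n)} bases on average")
```

Under the same assumption a two-base recognition sequence is expected once every 16 bases, which is why such an enzyme cuts far more often than the conventional 4-base and 6-base cutters.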

Of the 179 unique type II restriction endonucleases, 31 have a 4-
base recognition sequence, 11 have a 5-base recognition sequence, 127 have a 6-
base recognition sequence, and 10 have recognition sequences of greater
than 6 bases. In two cases, a restriction endonuclease has a recognition sequence
of less than 4 bases.

The restriction enzyme CviJ I has a three base recognition sequence 
or a two-base recognition sequence, depending on the reaction conditions. Under 
normal reaction conditions CviJ I recognizes the sequence PuGCPy (wherein 
Pu=purine and Py=pyrimidine) and cleaves between the G and C to leave blunt 
ends (Xia et al., 1987, Nucleic Acids Res. 15:6075-6090). Under "relaxed" or
"star" conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity
of CviJ I may be altered to cleave DNA more frequently. This activity is referred
to as CviJ I*, for star or altered specificity. However, CviJ I* activity is not
observed under conditions which favor star activity of other restriction
endonucleases.

The restriction enzyme BsuR I normally recognizes the sequence
GGCC and cleaves between the G and C to leave blunt ends (Heininger, et al.,
Gene 1:291-303 (1977)). Under relaxed conditions (high pH, low ionic strength,
and high glycerol concentration) the specificity of BsuR I may be altered to cleave
DNA more frequently. An isoschizomer of this enzyme, Hae III, does not display
this star activity.

In bacteria, the restriction endonuclease provides a mechanism of 
defense against foreign DNA molecules (e.g., bacteriophage DNA) by virtue of 
its ability to distinguish and cleave only exogenous DNA, leaving endogenous
bacterial DNA unaffected. Viral endonucleases possess the same discerning
capabilities, but rather than providing a means for defense, this activity has
presumably evolved to cripple the host's ability to replicate its own DNA and
allow the virus to assume control of the host's replication machinery.

Bacteria and viruses which express restriction endonucleases 
necessarily possess the inherent ability to protect their own genome from cleavage
by their endogenous endonuclease. The primary mechanism by which this is
accomplished is by modifying the organism's own DNA by, for example,
methylating a base in the recognition sequence, which prevents binding and
cleavage by the endonuclease. Therefore, to ensure viability, the genome of an
organism which expresses a restriction endonuclease is almost always heavily
modified, usually by methylation of cytosine or adenosine bases. The methylase
enzyme which modifies the genome (itself a useful tool in molecular biology) acts 
in tandem with the endonuclease, either as part of an enzyme complex 
(restriction/modification complex) or as two distinct entities. Therefore, 
recognizing that an organism expresses an enzyme with endonuclease activity 
strongly suggests the expression of an associated modifying methylase enzyme 
(and vice versa) and this association has led to isolation and cloning of a number 
of commercially available restriction/modification enzymes for use in the 
laboratory as discussed below.

One of the limitations in the use of restriction endonucleases exists
when cleavage of a given sequence is required and no known endonuclease exists
which is specific for that particular sequence. Therefore, the continued
identification and isolation of unique restriction endonucleases and altered reaction
conditions will allow for even more sophisticated manipulation of DNA in vitro.

A number of publications and patents describe the cloning of DNAs 
encoding restriction endonucleases. Included among these publications is Kiss,
A., et al., Nucleic Acids Research 13:6403-6421 (1985), which describes the
cloned nucleotide sequence of the BsuRI restriction-modification system isolated
from Bacillus subtilis. This system is specific for the sequence 5'-GGCC-3' and
is defined by two gene products which are transcribed by different promoters. 
The methylase component of the system shows homology to the methylase from 
the BspRI and SPR restriction-modification systems. 

Nwanko, D.O. and Wilson, G.G., Gene 64:1-8 (1988), describe the
cloning and expression of the MspI restriction and modification genes isolated
from Moraxella sp. This system recognizes the sequence 5'-CCGG-3' and both
enzymes are functional in E. coli. Evidence indicates that these genes are
transcribed in opposite directions, thus are probably under the control of different 
promoters. 

Ashok, K.D., et al., Nucleic Acids Research 20:1579-1585 (1992),
describe the purification and characterization of cloned MspI methyltransferase,
over-expressed in E. coli. At low concentrations the enzyme exists as a
monomer, but at higher concentrations it exists mainly as a dimer. Polyclonal
antibodies to the enzyme cross-react with methyltransferase genes of other
modification systems.

Brooks, J.E., et al., Nucleic Acids Research 19:841-850 (1991),
characterize the cloned BamHI restriction modification system from Bacillus
subtilis. The two genes are divergently oriented and separated by an open reading
frame which may serve as a transcriptional regulator in the native bacteria.

Slatko, B.E., et al., Nucleic Acids Research 15:9781-9796 (1987),
describe the cloning, sequencing and expression of the TaqI restriction-
modification system. These genes have the same transcriptional orientation, with
the methylase gene 5' to the endonuclease gene. E. coli clones which carry only
the endonuclease gene are viable even in the absence of the methylase gene. This
is an unusual case possibly explained by the 65°C optimal temperature for TaqI
restriction and the 37°C optimal temperature for E. coli growth.

Howard, K.A., et al., Nucleic Acids Research 14:7939-7951
(1986), describe the cloning of the DdeI restriction modification system from
Desulfovibrio desulfuricans by a two step method wherein the methylase gene is
first cloned and transformed into E. coli, followed by the cloning of the
endonuclease gene and transformation of this second gene into the methylase-
expressing bacteria. In order to maintain cell viability, high levels of methylase
expression are required before the endonuclease gene can be introduced into the
bacteria.

Ito, H., et al., Nucleic Acids Research 18:3903-3911 (1990),
describe the cloning, nucleotide sequence and expression of the HincII restriction-
modification system. The DNA was isolated from H. influenzae Rc, with the two
genes positioned in the same transcriptional orientation.

Shields, S.L., et al., Virology 176:16-24 (1990), describe the
cloning and sequencing of the cytosine methyltransferase gene M.CviJ I from the
Chlorella virus IL-3A. The methylase recognizes the sequence (G/A)GC(T/C/G)
and shows amino acid sequence homology with 5-methylcytosine methylases
isolated from bacteria. DNA encoding the methylase was obtained from the viral
genome which was propagated in the green alga host Chlorella.

Xia, Y., et al., Nucleic Acids Research 15:6075-6090 (1987),
discovered that IL-3A virus infection of a Chlorella-like green alga induces the
expression of the DNA restriction endonuclease CviJ I which has novel sequence
specificity. This endonuclease recognizes the sequence PuGCPy (wherein Pu =
purine and Py = pyrimidine) but does not cut the sequence PuGmCPy, where mC
is 5-methylcytosine.

U.S. Patent 5,137,823, issued August 11, 1992, to Brooks, J.E.,
describes a two step method for cloning the BamHI restriction modification
system wherein the methylase is cloned first and then introduced into a bacterial
host. The endonuclease is then cloned and introduced into the methylase-
expressing bacteria. This two step procedure provides the host DNA protection
from cleavage by the subsequently introduced endonuclease.

U.S. Patent 5,200,333, ('333) issued April 6, 1993, to Wilson,
G.G., describes a method for cloning restriction and modification genes.
Specifically this reference describes the cloning of the TaqI and HaeII systems
from Thermus aquaticus and Haemophilus aegyptius, respectively. In this
method, bacterial DNA was initially purified and digested, and the fragments
were then cloned into a vector to produce a bacterial DNA library. The library
was then transformed into E. coli and the cells were plated. Colonies were then 
scraped from the plate to form a primary cell library. Plasmid DNA from this 
cell library was purified and digested with the endonuclease of the two gene 
system. Bacteria which expressed the methylase gene had modified plasmid DNA 
which was protected from endonuclease activity, while plasmids from bacteria 
which lacked the intact methylase gene were digested. The resulting, undigested
plasmid DNA was then transformed into another bacterial strain and the bacteria
were plated. Surviving colonies were again harvested to give a secondary cell
library and the entire procedure repeated. Plasmids which code for the complete
restriction-modification system presumably survived each round of purification
and were enriched. Bacteria which survived several rounds of enrichment were
subsequently assayed for both methylase and endonuclease activity.

U.S. Patent 5,196,331, ('331) issued March 23, 1993, to Wilson,
G.G. and Nwanko, D., describes a method for cloning the MspI restriction and
modification genes. This patent describes a method identical to that of U.S.
Patent 5,200,333 ('333). '331 is a continuation-in-part of, and '333 is a
continuation of, U.S.S.N. 707,079 (now abandoned).

As mentioned above, Chlorella virus IL-3A encodes a unique 
restriction endonuclease called CviJ I (Xia et al., Nucleic Acids Res. 15:6075-6090
(1987)). IL-3A is a large, polyhedral, plaque-forming phycodnavirus (Francki,
R.I.B., et al., Arch. Virol. Suppl. 2, Springer-Verlag, Vienna (1991)) that replicates
in unicellular, eukaryotic green algae, Chlorella strain NC64A (Schuster, A.M.,
et al., Virology 150:170-177 (1986)). The double-stranded DNA genome of IL-3A
is approximately 330 kbp (Rohozinski et al., Virology 168:363-369 (1989)) and
contains 9.7% methylated cytidine (Van Etten, J.L. et al., Nucleic Acids Res.
13:3471-3478 (1985)). The cognate methyltransferase of CviJ I, M.CviJ I,
methylates (A/G)GC(T/C/G) sequences and has been cloned and sequenced
(Shields, S.L. et al., Virology 176:16-24 (1990)).

The use of a two/three base recognition endonuclease, such as
CviJ I, to improve numerous conventional molecular biology applications as well
as permitting novel applications has been described in co-pending U.S. Patent
Application Ser. No. 08/036,481, filed on March 24, 1993. The application
discloses methods for generating sequence-specific oligonucleotides from DNA
without prior knowledge of the nucleic acid sequence of such DNA, and
methods for cloning and labeling DNA after restriction digestion by a two base
recognition endonuclease. The application also teaches methods for generating
quasi-random fragments of DNA, methods for cloning, labeling, and sequencing
DNA, as well as epitope mapping of proteins. The ability to generate numerous
oligonucleotides with perfect sequence specificity or quasi-random distributions
of DNA fragments such as is possible with CviJ I* has important implications for
a number of conventional and novel molecular biology procedures.

Infection of Chlorella species NC64A with the IL-3A virus
produces sufficient CviJ I restriction endonuclease (CviJ I) for research purposes.
However, production of commercially useful amounts of CviJ I is limited with this
system due to the slow growth of Chlorella algae, the large number of
contaminating nucleases associated with the virus, and the small yield of enzyme
obtained after purification. In addition, biochemical and biophysical
characterization of the enzyme, such as molecular weight determination, is
difficult from the native source. Because of these limitations it would be useful
to clone the gene for CviJ I in order to provide an adequate large scale source of
enzyme for use as a molecular biological reagent. 

SUMMARY OF THE INVENTION
In one of its aspects, the present invention provides purified and 
isolated polynucleotides (e.g., DNA sequences and RNA transcripts thereof) 
encoding a unique restriction endonuclease, CviJ I, as well as polypeptides and
variants thereof which display activities characteristic of CviJ I. Activities of CviJ I
include the recognition of specific DNA sequences, binding to these sequences
and cleaving the bound DNA into fragments. Preferred DNA sequences of the
invention include viral genomic sequences as well as wholly or partially
chemically synthesized DNA sequences. Replicas (i.e., copies of the isolated
DNA sequences made in vivo or in vitro) of DNA sequences of the invention are 
also contemplated. A preferred DNA sequence is set forth in SEQ ID NO: 2 
herein and is contained as an insert in the plasmid pCJHl.4. In another of its 
aspects, the invention provides purified isolated DNA encoding a CviJ I
polypeptide by means of degenerate codons. 

Also provided are autonomously replicating recombinant
constructions such as plasmid DNA vectors incorporating CviJ I sequences and
especially vectors wherein DNA encoding CviJ I or a CviJ I variant is operatively
linked to an endogenous or exogenous expression control DNA sequence. 

According to another aspect of the invention, host cells, such as
prokaryotic and eukaryotic cells, are stably transformed with DNA sequences of
the invention in a manner allowing the desired polypeptides to be expressed
therein. Host cells expressing CviJ I and CviJ I variant products are useful in
methods for the large scale production of CviJ I and CviJ I variants wherein the
cells are grown in a suitable culture medium and the desired polypeptide products
are isolated from the host cells or from the medium in which the cells are grown.
A preferred host cell is E. coli. Still another aspect of the invention is a
recombinant CviJ I polypeptide.

The present invention is also directed to a method for the digestion 
of DNA with a restriction endonuclease reagent under conditions wherein said 
DNA is cleaved at a dinucleotide sequence selected from the group consisting of 
PyGCPy, PuGCPy, and PuGCPu, wherein Pu = purine and Py = pyrimidine.
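
The flanking-base classes named above can be made concrete with a short sketch that scans a sequence for GC dinucleotides and labels each by the purine/pyrimidine class of its neighbours. This is only an illustration under stated assumptions: the names `gc_site_classes`, `cut_positions`, `NORMAL` and `RELAXED` are hypothetical, the input is limited to A, C, G and T, only one strand of a linear molecule is scanned, and the cut is placed between the G and the C as described in the text.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")

# Site classes taken from the text: PuGCPy under normal conditions, and
# PyGCPy, PuGCPy, PuGCPu under the relaxed conditions described above.
NORMAL = {"PuGCPy"}
RELAXED = {"PuGCPy", "PyGCPy", "PuGCPu"}


def flank_class(base: str) -> str:
    """Classify a single base as purine ('Pu') or pyrimidine ('Py')."""
    if base in PURINES:
        return "Pu"
    if base in PYRIMIDINES:
        return "Py"
    raise ValueError(f"unsupported base: {base!r}")


def gc_site_classes(seq: str) -> list[tuple[int, str]]:
    """Return (cut_position, site_class) for every internal NGCN site.

    cut_position is the number of bases to the left of the cut, which
    falls between the G and the C.
    """
    seq = seq.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            label = f"{flank_class(seq[i - 1])}GC{flank_class(seq[i + 2])}"
            sites.append((i + 1, label))
    return sites


def cut_positions(seq: str, allowed: set[str]) -> list[int]:
    """Cut positions for the sites whose class is in the allowed set."""
    return [pos for pos, label in gc_site_classes(seq) if label in allowed]


if __name__ == "__main__":
    example = "AGCTCGCCAGCATGCA"  # made-up sequence, one site of each class
    print(gc_site_classes(example))
    print("normal :", cut_positions(example, NORMAL))
    print("relaxed:", cut_positions(example, RELAXED))
```

In the made-up example every one of the four NGCN classes occurs once; the PyGCPu site is the only one left uncut under the relaxed set.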

The present invention is also directed to a method for restriction 
endonuclease digestion of DNA comprising the step of digesting DNA with a 
restriction endonuclease reagent under conditions wherein said DNA is digested 
at 11 of 16 possible dinucleotide sequences and wherein said dinucleotide 
sequences are selected from the group consisting of PuCGPu, PuCGPy, and 
PyCGPu, and wherein Pu = purine and Py = pyrimidine. 

The present invention is directed to methods for shotgun cloning of DNA,
epitope mapping, and labeling DNA using the digestion methods of the present
invention. The present invention provides methods for quasi-random fragmenting
of DNA using the digestion methods of the present invention under conditions
wherein the DNA is only partially cleaved and the site preference of the
restriction endonuclease reagent is greatly reduced. By quasi-random is meant an
overlapping population of DNA fragments produced by digesting DNA using the
methods of the present invention without apparent site-preference and which
appears as a smear upon electrophoresis in a 1-2 wt. % agarose gel. The present
invention is also directed to the shotgun cloning and sequencing of quasi-random 
fragments of DNA produced by the methods of the present invention. Quasi- 
random fragments in the shotgun cloning method of the present invention are 
produced by partial digestion of DNA with a restriction endonuclease reagent
according to the methods of the present invention. More particularly, quasi-
random fragments of DNA useful in the cloning method of the present invention
are produced by the partial digestion of the DNA to be cloned with CviJ I, BsuR
I or with a restriction endonuclease reagent termed CGase I comprising Taq I and
Hpa II. Quasi-random fragments having a length of between about 100 and about
10,000 nucleotides are preferred. More preferred are quasi-random fragments of 
about 500 to about 10,000 nucleotides in length. The present invention is also 
directed to the generation of quasi-random fragmentation of DNA using the 
method of the present invention for the purposes of epitope mapping and gene 
cloning. These quasi-random fragments are expressed either in vitro or in vivo 
and the smallest fragment containing the desired function is identified by
screening assays well known in the art. 

The present invention is also directed to the production of 
anonymous primers from any DNA without prior knowledge of the nucleotide 
sequence. The present invention provides methods for anonymous primer cloning 
and sequencing after complete digestion of DNA utilizing CviJ I, BsuR I or 
CGase I using the methods of the present invention. 

Additionally, the present invention is directed to methods of 
labeling and detecting DNA comprising the complete digestion of DNA using the 
methods of the present invention, followed by a heat denaturation step, to yield
sequence specific oligonucleotides. In particular, an aspect of the present
invention involves labeling DNA with sequence specific oligonucleotides of about 
20 to about 200 bases in length (with an average size of between 20-60 bases) 
generated by CviJ I, BsuR I or CGase I digestion of the template DNA. 

More particularly, the invention is directed to restriction generated 
oligonucleotide labeling (RGOL) of DNA which comprises the digestion of an
aliquot of template DNA with CviJ I followed by a simple heat denaturation step,
thereby generating numerous sequence specific oligonucleotides, which can then
be utilized for labeling nucleic acids by a number of methods, including primer
extension type reactions with a DNA polymerase and various labels, isotopic
or non-isotopic (RGOL-PEL); 5' end labeling with polynucleotide kinase; and 3' end
labeling using terminal transferase and various labels, isotopic or non-isotopic.
Labeling at the 3' end, also referred to as tailing, adds numerous labels per
oligonucleotide (1-200), depending on the labeling conditions. The addition of
10-500 oligonucleotides generated per template results in a significant signal
amplification not obtainable by conventional methods.

The invention is also directed to thermal cycle labeling (TCL) 
which comprises the simultaneous labeling and amplification of probes utilizing 
CviJ I or CGase I restriction generated oligonucleotides as the starting material. 
In this method, natural DNA of unknown sequence is digested with CviJ I to 
generate numerous double-stranded fragments which are then heat denatured to 
yield oligonucleotides. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and
extension in the presence of a thermostable DNA polymerase or functional
fragment thereof which maintains polymerase activity, deoxynucleotide
triphosphates and the appropriate buffer. Alpha 32P-dATP (or any of the other
three deoxynucleotide triphosphates), biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is incorporated during the extension step for subsequent
detection purposes. Thermal cycle labeling efficiently labels DNA while
simultaneously amplifying large amounts of the labeled probe. In addition, TCL
probes exhibit a 10 fold improvement in detection sensitivity compared to
conventional probes. 

The present invention is also directed to TCL in which the 
thermostable DNA polymerase supplies endogenous primers for enzymatic 
extension. This method is referred to as Universal Thermal Cycle Labeling 
(UTCL). In this method natural DNA of unknown sequence is combined intact 
with the holo-enzyme of a thermostable DNA polymerase, deoxyribonucleotide 
triphosphates, and the appropriate buffer. The holo-enzyme and its associated
endogenous primers are then combined with intact template and subjected to
repeated cycles of denaturation, annealing, and extension. Alpha 32P-dATP,
32P-dTTP, 32P-dGTP, 32P-dCTP, biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is also included in the extension step for subsequent detection
purposes. Isotopic labels useful in the practice of the present invention include
but are not limited to 32P, 33P, 35S, 14C and 3H. Non-isotopic labels useful in
the present invention include but are not limited to fluorescein, biotin,
dinitrophenol and digoxigenin.

The present invention is also directed to an improved method for 
purifying CviJ I from the algae Chlorella infected with the virus IL-3A.

In addition the present invention is directed to restriction 
endonuclease reagents which, under conditions which relax the sequence 
specificity of one or more restriction endonucleases, cleave DNA at the 
dinucleotide sequences AT or TA. 

The present invention is also directed to a restriction endonuclease 
reagent comprising, in combination, Taq I and Hpa II, which is capable of
digesting DNA at 11 of 16 possible dinucleotide sequences, said sequences 
selected from the group consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, 
and wherein Pu = purine and Py = pyrimidine. 

The following examples are intended to be illustrative of the several 
aspects of the present invention and are not intended in any way to limit the scope 
of any aspect of the present invention. 

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a map of the plasmid p710 which contains DNA 
sequences encoding the IL-3A viral methyltransferase M.CviJ I;

Figure 2 is the nucleotide sequence of 5497 bp of cloned IL-3A
viral DNA;

Figure 3 is a restriction map of the cloned IL-3A viral DNA,
including the identified open reading frames; 

Figure 4 is the DNA sequence of the CviJ I gene with its flanking
regions. The predicted amino acid sequence is provided below the nucleotide 
sequences; 

Figure 5A depicts the theoretical frequency and distribution of 
CviJ I* restriction generated oligomers of individual lengths; Figure 5B shows the
actual frequency and distribution of CviJ I* restriction generated oligomers of
various lengths; 

Figure 6 is a flow chart depicting anonymous primer cloning; 
Figure 7 is a photographic reproduction of a gel depicting CviJI 
restriction digests of pUC19; 

Figure 8 is a photographic reproduction of a gel depicting 
comparisons of sonicated versus CviJ I* partially digested DNAs;

Figure 9A is a photographic reproduction of an agarose gel 
electrophoresis analysis of size-fractionated DNA by microcolumn 
chromatography compared to fractionation by agarose gel electroelution; 

Figure 9B-E illustrates additional trials of the same procedures
used in Figure 9A; 

Figure 10A illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJ I and fractionated by microcolumn
chromatography; 

Figure 10B-C illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJ I and fractionated by agarose gel
electrophoresis;

Figure 11 is a schematic depiction of the distribution of CviJ I sites
in pUC19; and 

Figure 12 is a graph of the rate of sequence accumulation by
CviJ I shotgun cloning and sequencing.



DETAILED DESCRIPTION
The gene for the restriction endonuclease R.CviJ I was cloned into
E. coli so as to provide an adequate source of R.CviJ I for use as a molecular
biological reagent. Biologically active CviJ I has been purified from E. coli to
apparent homogeneity. The molecular weight of E. coli-derived R.CviJ I is 32.5
kD by SDS gel electrophoresis. N-terminal amino acid sequence analysis of this
protein and comparison to the nucleotide sequence of the gene revealed that the
translation of this enzyme is probably initiated with a GTG start codon, instead
of the usual ATG initiation codon. The structural gene is 834 nucleotides in
length coding for a protein of 278 amino acids (31.6 kD). A second peak of
R.CviJ I activity which elutes separately from the 32.5 kD form can be seen in the
initial stages of enzyme purification. Trace amounts of a larger molecular weight
form have not been observed to date. However, the R.CviJ I gene does possess
an in-frame upstream ATG codon which if translated would yield a predicted 41.4
kD protein. The structural gene for this potentially larger product is 1074
nucleotides in length coding for a putative protein of 358 amino acids.
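
The gene and protein lengths quoted above are related by the triplet code, three nucleotides per amino acid. A minimal check of that arithmetic, assuming the stated nucleotide counts exclude the stop codon (the helper name `codons_in_orf` is illustrative only):

```python
def codons_in_orf(nucleotides: int) -> int:
    """Number of amino acids encoded by a coding region of the given length,
    assuming the length excludes the stop codon (an assumption made here to
    match the figures quoted in the text)."""
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("coding length must be a positive multiple of 3")
    return nucleotides // 3


if __name__ == "__main__":
    gtg_initiated = codons_in_orf(834)    # form initiated at the GTG codon
    atg_initiated = codons_in_orf(1074)   # form initiated at the upstream ATG
    print(gtg_initiated, atg_initiated, atg_initiated - gtg_initiated)
```

On these figures the in-frame upstream ATG would add 240 nucleotides, or 80 amino acids, to the N-terminus of the shorter form.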

The present invention is also directed to a method for the
fragmentation and cloning of DNA using the restriction endonuclease CviJ I under
conditions which allow the enzyme to cleave DNA at the dinucleotide sequence
GC. In addition, the present invention is also directed to the cloning of quasi-
random fragments of DNA digested using the fragmentation method of the present
invention.

As an alternative to the methods for constructing random clone
libraries described above, methods were devised for the construction of such
libraries which require fewer steps and reagents, which require smaller amounts
of DNA, which have relatively high cloning efficiencies and which take less time
to complete. These methods relate to the recognition that a partial digest with a
two or three base recognition endonuclease cleaves DNA frequently enough to be
functionally random with respect to the rate at which sequence data may be
accumulated from a shotgun clone bank. The restriction enzyme CviJ I normally
recognizes the sequence PuGCPy and cleaves between the G and C to leave blunt
ends (Xia et al., Nucl. Acids Res. 15:6075-6090 (1987)). Under "relaxed"
conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity of
CviJ I can be altered to cleave DNA more frequently and perhaps as frequently
as at every GC. This activity is referred to as CviJ I*. Because of the high
frequency of the dinucleotide GC in all DNA (16 bp average fragment size for
random DNA), quasi-random libraries may be constructed by partial digestion of 
DNA with CviJ I . A DNA degradation method with low levels of sequence 
specificity produces a smear of the target DNA when analyzed by agarose gel 
electrophoresis. Digestion of the plasmid pUC19 under partial CviJ I* conditions 
does not result in a non-discrete smear; rather, a number of discrete bands are 
found superimposed upon a light background of smearing, suggesting that CviJ 
I has some site preference. Atypical reaction conditions according to the present 
invention eliminate this apparent site preference of CviJ I* to produce an activity 
(termed CviJ I ) in combination with a rapid gd filtration size exclusion step, 
streamlines a number of aspects involved in shotgun cloning. 
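
The 16 bp average fragment size quoted above for the dinucleotide GC can be checked with a short calculation. The following Python sketch is illustrative only and assumes random DNA with all four bases equally frequent; the function names and the use of the IUPAC codes R (Pu) and Y (Py) are choices made for this example and do not come from the specification.

```python
from fractions import Fraction

# IUPAC codes used for illustration: R = purine (Pu), Y = pyrimidine (Py).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}

def site_probability(pattern: str) -> Fraction:
    """Chance that a given position starts a match, assuming equal base frequencies."""
    p = Fraction(1)
    for symbol in pattern:
        p *= Fraction(len(IUPAC[symbol]), 4)
    return p

def mean_fragment_size(pattern: str) -> Fraction:
    """Expected spacing (bp) between sites for a complete digest of random DNA."""
    return 1 / site_probability(pattern)

print(mean_fragment_size("GC"))    # 16  (CviJ I* dinucleotide specificity)
print(mean_fragment_size("RGCY"))  # 64  (normal CviJ I specificity, PuGCPy)
```

Under the same assumption, the normal PuGCPy specificity would cut on average every 64 bp, a fourfold lower frequency than the relaxed GC specificity.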

One aspect of the present invention involves the use of the 
two/three base recognition endonuclease CviJ I, in conjunction with a simple spin- 
column method to produce libraries equivalent in final form to those generated by 
the combination of sonication and agarose gel electroelution. However, the 
method of the present invention requires fewer steps, a shorter time period, and 
significantly less substrate (nanogram amounts) when compared to conventional 
procedures. Both small and large sequencing projects using the methods 
described herein are within the scope of the present invention. 

Current sequencing paradigms require the generation of a new 
template for each 350-500 nucleotides sequenced. On this basis, sequencing both 
strands of the human genome would require at least 12 million templates 500 
nucleotides long, assuming no overlap between templates.
A random approach, such as shotgun sequencing, would require 30
to 50 million templates, assuming the entire genome were randomly subcloned.
As many as 250,000 libraries may be needed to generate the requisite templates
from a subcloned and ordered array of this genome, depending on the type of
vector utilized, and the degree of overlap between such clones. The ability to
generate shotgun libraries in a semi-automated, microtiter plate format would
greatly simplify such large scale projects.
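
The template estimate above can be reproduced with simple arithmetic. The Python sketch below is illustrative only; the haploid genome size of 3 x 10^9 bp is an assumption introduced here (the text does not state the value used), and the function name is hypothetical.

```python
import math

def templates_needed(genome_bp: int, read_len: int, strands: int = 2) -> int:
    """Minimum number of non-overlapping templates needed to cover every strand once."""
    return math.ceil(genome_bp * strands / read_len)

# Assumption for illustration: a haploid human genome of about 3 x 10^9 bp.
HUMAN_GENOME_BP = 3_000_000_000

minimum = templates_needed(HUMAN_GENOME_BP, 500)
print(minimum)  # 12000000

# Redundancy implied by the 30 to 50 million templates quoted for a random approach.
for shotgun_templates in (30_000_000, 50_000_000):
    print(f"{shotgun_templates / minimum:.1f}-fold")
```

On these assumptions, the shotgun figures correspond to roughly 2.5 to 4 times the non-overlapping minimum.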

The development of methods for cloning large DNA molecules in
yeast artificial chromosomes (Burke et al., Science 236:806-812 (1987)), or in
bacteriophage P1-derived vectors (Sternberg, Proc. Natl. Acad. Sci. USA
87:103-107 (1990)), simplifies the subdivision and analysis of very large genomes.
However, the large size of the resulting subclones (100 - 1000 kbp) presents
additional challenges for subsequent sequencing efforts. A report of the
sequencing of a 134 kbp genome by random shotgun cloning directly into a
bacteriophage M13 vector indicates that numerous intermediate stages of
subcloning, mapping, and overlapping such clones may be eliminated (Davison,
J. DNA Seq. and Mapping 1:389-394 (1992)). An order of magnitude reduction
in the amount of DNA required for shotgun cloning would substantially simplify
efforts to directly sequence 100,000 bp sized molecules and beyond.

The ability to generate an overlapping population of randomly
fragmented DNA molecules is considered essential for minimizing the closure of
nucleotide sequence gaps by the shotgun cloning method. The use of a very
frequent-cutting restriction enzyme, such as CviJ I, is an approach which has not
been utilized. Reaction conditions according to the present invention result in the
quasi-random restriction of pUC19 and lambda DNA, as judged by the degree of
smearing observed.

The randomness of this CviJ I** reaction was quantified by
sequence analysis of 76 such partially-fragmented pUC19 subclones. The analysis
showed that CviJ I** partial digestion (limiting enzyme and time) restricts DNA
at PyGCPy, PuGCPu, and PuGCPy (but not PyGCPu), and is thus a hybrid
reaction which combines the three base recognition specificity of CviJ I with the
"two" base recognition specificity of CviJ I*. Interestingly, most of the "relaxed"
cleavage observed under CviJ I** conditions occurred in those portions of the
sequence which were deficient in "normal" restriction sites. CviJ I** treatment
produces a relatively uniform size distribution of DNA fragments, permitting
sequence information to be accumulated in a statistically random fashion.
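
The three specificities described above can be compared by scanning a sequence for GC dinucleotides and classifying the flanking bases. The Python sketch below is a minimal illustration, not part of the disclosed method; the function names and the toy sequence are hypothetical, and CviJ I* is modeled as cutting at every GC, which the text states only as a possibility ("perhaps as frequently as at every GC").

```python
PURINES = set("AG")

# Site classes cleaved by each activity, as described in the text.
# Assumption for illustration: CviJ I* is treated as cutting every NGCN site.
CLEAVED = {
    "CviJ I": {"PuGCPy"},
    "CviJ I**": {"PuGCPy", "PyGCPy", "PuGCPu"},
    "CviJ I*": {"PuGCPy", "PyGCPy", "PuGCPu", "PyGCPu"},
}

def site_class(tetramer: str) -> str:
    """Classify an NGCN tetramer by its flanking bases (Pu = A/G, Py = C/T)."""
    left = "Pu" if tetramer[0] in PURINES else "Py"
    right = "Pu" if tetramer[3] in PURINES else "Py"
    return f"{left}GC{right}"

def cut_sites(seq: str, activity: str) -> list:
    """Return (cut_position, site_class) pairs; the cut falls between G and C."""
    seq = seq.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            cls = site_class(seq[i - 1:i + 3])
            if cls in CLEAVED[activity]:
                sites.append((i + 1, cls))
    return sites

# Hypothetical toy sequence containing one site of each class.
toy = "AGCTAATGCCAAGGCAAACGCA"
for activity in CLEAVED:
    print(activity, cut_sites(toy, activity))
```

Because PyGCPy on one strand reads as PuGCPu on the complementary strand, scanning a single strand is sufficient to find both of these classes.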

Shotgun cloning with CviJ I** digested DNA is efficient partly
because the resulting fragments are blunt ended. Other methods currently used
to randomly fragment DNA, including sonication, DNAse I treatment, and low
pressure shearing, leave ragged ends which must be converted to blunt ends for
efficient vector ligation. Other than a heat denaturation step to inactivate the
endonuclease, no additional treatments are required for cloning CviJ I** restricted
DNA. In addition, the preligation step required to equalize representation of the
ends of a DNA molecule prior to sonication or DNAse I treatment is not
necessary with CviJ I** fragmentation. CviJ I* cleaves its cognate recognition
site very close to the ends of a linear molecule, as judged by the very small
fragments resulting from complete digestion of pUC19 as depicted in Figure 2,
lane 1.

The overall efficiency of shotgun cloning depends not only on the
fragmentation process, but also upon the size fractionation procedure used to
remove small DNA fragments. The efficiency of cloning agarose gel fractionated
DNA was found to be unexpectedly variable. Numerous experiments produced
an erratic distribution of sized material and the resulting cloned inserts were
uniformly small (70% < 500 bp in one trial, 100% < 500 bp in another). The
method of the present invention includes a simple and rapid micro-column
fractionation method, which has resulted in three to thirteen times more
transformants than agarose gel fractionation. More importantly, the size
distribution of the cloned inserts from column-fractionated DNA was skewed
toward larger fragments (88% > 500 bp). Micro-column fractionation also
eliminates the chemical extraction steps required for agarose fractionated DNA.
After the target DNA has been column-fractionated, no further treatments are
required for cloning. Combining CviJ I** partial restriction with micro-column
fractionation permits the construction of useful libraries from as little as 200 ng
of substrate, an order of magnitude less starting material than recommended for
sonication/end-repair and agarose gel fractionation procedures.

The CviJ I** reaction represents a unique alternative for controlling
the partial digestion of DNA, a technique which is fundamental to the construction
of genomic libraries (Maniatis et al., Cell 15:687-701 (1978)) and restriction site
mapping of recombinant clones (Smith et al., Nucl. Acids Res. 3:2387-2398
(1976)). Partial DNA digests are notably variable and are strongly dependent on
the concentration and purity of the DNA, the amount of enzyme used, the
incubation time, and the batch of enzyme. Partial digestions may also be variable
with respect to the rate at which a particular recognition sequence is cleaved
throughout the substrate. Optimal reaction conditions, such as those which render
such partial digests independent of one or more of these variables, allow more
precise control of the end product. Several controlling schemes may be
employed, including: the addition of a constant amount of carrier DNA (Kohara
et al., Cell 50:495-508 (1987)), the use of limiting amounts of Mg2+ (Albertson
et al., Nucl. Acids Res. 17:808 (1989)), ultraviolet irradiation (Whitaker et al.,
Gene 41:129-134), and the combination of a restriction enzyme and a sequence
complementary DNA methylase (Hoheisel et al., Nucl. Acids Res. 17:9571-9582
(1989)). Utilizing three different batches of CviJ I, and three different DNA
templates from five separate preparations, a uniform CviJ I** partial digestion
pattern was obtained that was primarily time-dependent when a constant ratio of
0.3 units of enzyme per µg of DNA was used.

The rate at which a particular restriction site is cleaved at different
locations in a substrate is variable for many endonucleases (Brooks et al.,
Methods in Enzymol. 152:113-129 (1987)). Reaction conditions for CviJ I may
be optimized to substantially reduce the site preferences of this enzyme during
partial digestion (see Figure 2, lanes 3 and 4). Normally, "star" reaction
conditions result in cleavage at new sites. The use of star reaction conditions
according to the present invention (dimethyl sulfoxide [DMSO] and lowered ionic
strength) to affect the partial digestion activity of CviJ I** does not result in an
altered restriction site cleavage as assayed by sequencing the products of 76
digestion reactions. Instead, the relative rate of cleavage of individual sites
appears to be more uniform under these conditions. A 3-5 fold increase in the
rate of normal CviJ I restriction with the standard buffer and DMSO further
substantiates this approach. All of these results indicate that, under the
appropriate reaction conditions, CviJ I is useful for a number of other
applications, such as high resolution restriction mapping and fingerprinting,
diagnostic restriction of small PCR fragments, and construction of genomic DNA
libraries.

Another aspect of the present invention involves quasi-random
fragmentation of DNA using the method of the present invention for epitope
mapping and cloning intact genes. The same method as described above for
shotgun cloning is utilized, except that an expression vector is used to generate
functional proteins from the DNA.

Another aspect of the present invention involves fragmenting DNA
using the present invention to generate multiple oligonucleotides from any
double-stranded DNA template. Restriction-generated oligonucleotides (RGO) are
sequence specific oligonucleotides generated from any DNA according to the
present invention. CviJ I* presumably cleaves the recognition sequence GC
between the G and C to leave blunt ends (Xia et al., Nucl. Acids Res.
15:6075-6090 (1987)). Because of the high frequency of the dinucleotide GC in
all DNA (16 bp average fragment size for random DNA), a complete CviJ I*
restriction results in numerous fragments which are about 20-200 bp in size. These
restriction fragments are generated from an aliquot of the template itself and are
heat-denatured, to yield numerous single-stranded oligonucleotides which are of
variable length but which are specific for the cognate template. Complete CviJ
I* restriction of the small plasmid pUC19 (2689 bp) theoretically yields 314
oligonucleotides after a heat-denaturation step. The ability to generate numerous
oligonucleotides with perfect sequence specificity is an unusual result of the use
of this class of enzyme according to the present invention. Such oligonucleotides
are uniquely suited for purposes of labeling DNA, as described below.
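
The relationship between GC cut sites, fragments, and single-stranded oligonucleotides can be sketched as follows. This Python example is illustrative only: it assumes cleavage at every GC (complete digestion), assumes each double-stranded fragment yields two oligonucleotides on heat denaturation, and uses a hypothetical toy sequence rather than pUC19. On those assumptions, the 314 oligonucleotides quoted for pUC19 would correspond to 157 fragments of the circular plasmid.

```python
def gc_cut_positions(seq: str, circular: bool = True) -> list:
    """Positions where a complete GC digest cuts (between G at i and C at i + 1)."""
    seq = seq.upper()
    n = len(seq)
    last = n if circular else n - 1
    return [i + 1 for i in range(last) if seq[i] == "G" and seq[(i + 1) % n] == "C"]

def fragment_lengths(seq: str, circular: bool = True) -> list:
    """Lengths of the blunt-ended fragments produced by a complete digest."""
    n = len(seq)
    cuts = gc_cut_positions(seq, circular)
    if not cuts:
        return [n]
    if circular:
        # k cuts in a circle give k fragments; wrap from the last cut to the first.
        return [(b - a) % n or n for a, b in zip(cuts, cuts[1:] + cuts[:1])]
    edges = [0] + cuts + [n]
    return [b - a for a, b in zip(edges, edges[1:])]

def oligo_count(seq: str, circular: bool = True) -> int:
    """Each double-stranded fragment yields two single strands after denaturation."""
    return 2 * len(fragment_lengths(seq, circular))

toy = "AAGCTTGGCCAAGCAT"  # hypothetical 16 bp sequence with three GC dinucleotides
print(fragment_lengths(toy, circular=False))  # [3, 5, 5, 3]
print(fragment_lengths(toy, circular=True))   # [5, 5, 6]
print(oligo_count(toy, circular=True))        # 6
```

The same functions could be applied to a full plasmid sequence, supplied as a string, to estimate the number and size range of restriction-generated oligonucleotides.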

One application of CviJ I* restriction-generated oligonucleotides is
to directly label them using conventional methods. There are several important
advantages in using CviJ I* restriction-generated oligonucleotides. Conventional
methods employing synthetic oligonucleotides for detection purposes generally use
one oligonucleotide containing one or a few labels. A complete CviJ I* digest
generates hundreds of oligonucleotides from a given template, depending on the
size of the template, and thus makes hundreds of sites available for labeling,
regardless of the labeling scheme utilized. These hundreds of sequence specific
restriction-generated oligonucleotides have two important advantages over
conventional probes used in nucleic acid detection methods. First, the generation
of multiple oligonucleotide probes directed at multiple sites in a given target
(theoretically, 314 sites in pUC19) provides enhanced detection sensitivities
compared to synthetic oligonucleotides which are directed at 1 or a few sites in
a target. The numerous labeled restriction-generated oligonucleotides represent
a 10-100 fold amplification of the signal for detection compared to the use of a
single oligonucleotide. Second, the short length of the restriction-generated
oligonucleotides permits more efficient hybridization. This is important for two
reasons. First, the hybridization time using restriction-generated oligonucleotides is
reduced to 1 hr as opposed to an overnight incubation with conventional probes
hundreds of nucleotides in length. This is a very important advantage when using
oligonucleotide probes in clinical settings. Second, the penetration of probes into
permeabilized cells is a critical issue for in situ hybridization procedures. The
smaller the probe, the easier the entry into the cell. Thus, the use of multiple
oligonucleotide probes generated by the two base cutters greatly improves the
sensitivity of in situ hybridization, a technique of considerable importance in
research and clinical labs. Finally, when using membrane-based hybridization
procedures, only small sections of a target nucleic acid are exposed and available
for hybridization. Multiple oligonucleotides derived from a cognate template
exhibit better detection sensitivities compared to long probes.

Another application of restriction-generated oligonucleotides for
labeling is to employ them as primers in a polymerase extension labeling reaction
in conjunction with a repetitive thermal cycling regimen of denaturation,
annealing, and extension. Thermal Cycle Labeling (TCL) is a method for
efficiently labeling double-stranded DNA while simultaneously amplifying large
amounts of the labeled probe. The TCL system employs the two base recognition
endonuclease CviJ I* to generate sequence-specific oligonucleotides from the
template DNA itself. These oligonucleotides are combined with the intact
template and subjected to repeated cycles of denaturation, annealing, and
extension by a thermostable DNA polymerase from, for example, Thermus flavus.
A radioactive- or non-isotopically-labeled deoxynucleotide triphosphate is
incorporated during the extension step for subsequent detection purposes. The
amplified, labeled probes represent a very heterogeneous mixture of fragments,
which appears as a large molecular weight smear when analyzed by agarose gel
electrophoresis. Primer-primer amplification, a side product of this reaction
(produced by leaving out the intact template in the TCL reaction), may result in
enhanced detection sensitivity, perhaps by forming branched structures.
Biotin-labeled probes generated by the TCL protocol detect as little as 25 zeptomoles
(2.5 x 10^-20 moles) of a target sequence. A 50 µl TCL reaction yields as much
as 25 µg of labeled DNA, enough to probe 25 to 50 Southern blots. After 20
cycles of denaturation and extension, biotin-dUTP-incorporated TCL probes may
be routinely detected at a 1:10^6 dilution, which is 1000 fold more sensitive than
RPL, and indicates that a significant degree of net synthesis or amplification of
the probe is occurring. In addition, non-isotopically-labeled TCL probes exhibit
a 10-fold improvement in detection sensitivity when compared to RPL-generated
probes. 32P-labeled probes generated by the TCL protocol may also detect as
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. As little as 10
pg of template DNA is enough to synthesize 5-10 ng of probe. The radioactive
version of TCL generates probes having extremely high specific activities, e.g.
(about 5 x 10^ cpm/ng DNA), which permits 5 to 10-fold lower detection limits than
conventional labeling protocols.
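
To put the biotin-probe detection limit in perspective, a quantity in zeptomoles can be converted to moles and to an approximate number of target molecules. The Python sketch below is illustrative arithmetic only; the function names are hypothetical, and the Avogadro constant is the standard SI value rather than a figure taken from the specification.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole (SI defined value)
ZEPTO = 1e-21              # one zeptomole expressed in moles

def zeptomoles_to_moles(zmol: float) -> float:
    """Convert an amount in zeptomoles to moles."""
    return zmol * ZEPTO

def zeptomoles_to_molecules(zmol: float) -> float:
    """Approximate number of molecules in the given number of zeptomoles."""
    return zeptomoles_to_moles(zmol) * AVOGADRO

moles = zeptomoles_to_moles(25)
print(math.isclose(moles, 2.5e-20))        # True
print(round(zeptomoles_to_molecules(25)))  # 15055
```

By this calculation, 25 zeptomoles corresponds to roughly fifteen thousand copies of the target sequence.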

There are several advantages to using restriction-generated
oligonucleotides for primer extension labeling of DNA. One advantage is the
specificity of the primers. All of the oligonucleotides generated by the TCL
system are specific for the template utilized, unlike random primer labeling (RPL)
which utilizes synthetic oligonucleotides 6-9 bases in length having a random
sequence. The amount of primer required for efficient labeling with the TCL
system is only 10 ng, compared to the 10 µg of random primers utilized for RPL.
Due to their short length, random primers anneal very inefficiently above
25-37°C, thus RPL is limited to DNA polymerases such as Klenow or T7. The
restriction-generated oligonucleotides are longer than the random primers,
which extends the hybridization and extension conditions to include a wide variety
of temperatures and polymerases. Thus, the use of the restriction-generated
sequence-specific oligonucleotides results in more efficient hybridization and
extension as compared to RPL. The TCL system has been optimized for labeling
with a thermostable DNA polymerase which allows the option of temperature
cycling. After 20 cycles of denaturation and extension, a significant amount of
amplified TCL probes can be generated. Most importantly, TCL-labeled probes
exhibit a 10 fold improvement in detection sensitivity when compared to
RPL-generated probes.

Another aspect of the present invention involves a variation of TCL
called Universal Thermal Cycle Labelling (UTCL) in which the extension primers
are not supplied by CviJI restriction, but rather, are found endogenously in the
enzyme preparations of thermostable DNA polymerases. Random sequence DNA
is usually co-purified along with the holo-enzyme preparation of the thermostable
DNA polymerases, regardless of the source of the enzyme, i.e. native or cloned.
However, only the holo-enzyme, and not the exonuclease minus deletion variants,
contains the endogenous DNA. Typically, when the holo-enzymes of thermostable
polymerases are used in protocols such as the polymerase chain reaction, the
presence of such primers can create spurious results. Methods for circumventing
the problems of endogenous DNA are described in PCR Protocols: A Guide to
Methods and Applications, Eds. M. Innis et al., Academic Press, 1990.

This residual DNA is rather short (approximately 5-25 bases), as
assayed by end-labeling with [γ-32P]ATP and polynucleotide kinase, and acts as
endogenous "random" primers in a TCL-type reaction. UTCL combines the
holo-enzyme of a thermostable polymerase from, for example, Thermus flavus, with
the intact DNA template, and the mixture is subjected to repeated cycles of
denaturation, annealing, and extension. A radioactive- or non-isotopically-labeled
deoxynucleotide triphosphate is incorporated during the extension step for
subsequent detection purposes. The amplified, labeled probe represents a very
heterogeneous mixture of fragments, which appears as a large molecular weight
smear when analyzed by agarose gel electrophoresis. Biotin-labeled probes
generated by the UTCL protocol detect as little as 25 zeptomoles (2.5 x 10^-20
moles) of a target sequence. A 15 µl UTCL reaction yields as much as 5-10 µg
of labeled DNA, enough to probe 5 to 10 Southern blots. After 20 cycles of
denaturation and extension, biotin-dUTP-incorporated UTCL probes may be
routinely detected at a 1:10^6 dilution, which is 1000 fold more sensitive than
RPL, and indicates that a significant degree of net synthesis or amplification of
the probe is occurring. In addition, non-isotopically-labeled UTCL probes exhibit
a 10-fold improvement in detection sensitivity when compared to RPL-generated
probes. 32P-labeled probes generated by the UTCL protocol may also detect as
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. The radioactive
version of UTCL generates probes having extremely high specific activities, e.g.
(about 5 x 10^ cpm/µg DNA), which permits 5 to 10-fold lower detection limits
than conventional labeling protocols.

The present invention is illustrated by the following examples
relating to the isolation of a full length viral DNA clone encoding R.CviJI, to the
expression of R.CviJI DNA in E. coli strain DH5αF'MCR and to purification of
R.CviJI from this bacterial strain. More particularly, Example 1 provides for the
propagation of IL-3A virus and isolation of viral genomic DNA. Example 2
addresses the improved expression of a clone for the viral methylase M.CviJI.
Example 3 describes the strategy for isolating and cloning the viral R.CviJI gene
by a forced co-cloning strategy of the M.CviJI gene. Example 4 describes the
sequencing of cloned IL-3A genomic DNA and identification of the R.CviJI gene.
Example 5 relates the methods for purification of CviJI to homogeneity from an
E. coli strain, DH5αF'MCR, transformed with a plasmid which encodes the
R.CviJI enzyme. Example 6 details the amino acid sequence analysis of the
purified R.CviJI enzyme. Example 7 describes the analysis of CviJI recognition
sequences. Example 8 relates to a technique for producing restriction generated
oligonucleotides using CviJI. Example 9 relates the generation of anonymous
primers using CviJI. Example 10 describes end-labeling of CviJI restriction
generated oligonucleotides. Example 11 describes primer extension labeling of
DNA using restriction generated oligonucleotides. Example 12 relates the use of
CviJI in thermal cycle labeling of DNA as well as the method of universal thermal
cycle labelling. Example 13 provides a method for generation of quasi-random
DNA fragments using CviJI. Example 14 describes fractionation of CviJI digested
DNA by size using spin column chromatography. Example 15 details the relative
cloning efficiency of CviJI digested, size-fractionated DNA by gel elution and
chromatographic methods. Example 16 describes the comparison of cloning
efficiency using lambda DNA fragmented by both sonication and CviJI**
techniques. Example 17 details the use of CviJI** fragmentation for shotgun
cloning and sequencing. Example 18 describes the shotgun cloning of lambda
DNA using CviJI. Example 19 describes the use of CviJI in epitope mapping
techniques. Example 20 describes the restriction endonuclease reagent CGase I.



Example 1 
Propagation of IL-3A Virus



The exsymbiotic Chlorella-like alga, NC64A, originally isolated
from Paramecium bursaria (Karakashian, S.J. and Karakashian, M.W., Evolution
and Symbiosis in the Genus Chlorella and Related Algae. Evolution 19:368-377
(1965)), was grown and maintained in Bold's basal medium (BBM) (Nichols,
H.W. and Bold, H.C., J. Phycol. 1:34-38 (1965)) modified by the addition of
0.5% sucrose, 0.1% proteose peptone, and 20 µg/ml tetracycline (MBBM).
Cultures were inoculated with 1 X 10^ algae cells/ml and grown at 25°C in 250
ml of MBBM in 500 ml Erlenmeyer flasks on a rotary shaker (150 rpm) in
continuous light (ca. 30 µE m^-2 sec^-1). Growth was monitored by light
scattering measured as and/or by direct cell counts with a hemocytometer.

When the cultures reached approximately 1 X 10^ algae cells/ml
they were inoculated with filter sterilized (0.4 µm nitrocellulose filter,
Nucleopore, Pleasanton, California) IL-3A virus at a multiplicity of infection of
0.01 and incubated for an additional 48 - 72 hours at 25°C. The crude lysate was
then centrifuged at 3000 rpm (2000 xg) for 10 minutes to remove cellular debris.
Nonidet P-40 was then added to 1% (v/v) and the virus was pelleted from the
supernatant by centrifuging at 15,000 rpm at 4°C for 75 minutes in a Beckman
No. 30 rotor. The viral pellet was gently resuspended in 0.05 M Tris-HCl, pH
7.8, and the sample was layered on linear 10 - 40% sucrose gradients equilibrated
with 0.05 M Tris-HCl, pH 7.8, and centrifuged for 20 minutes at 20,000 rpm at
4°C in a Beckman SW28 rotor. The viral band, which was present in the center
of the gradient as an opaque band, was removed, diluted with 0.05 M Tris-HCl,
pH 7.8, and pelleted by centrifugation at 15,000 rpm at 4°C for 120 minutes in
a Beckman No. 80 rotor. The virus was resuspended in a small volume (10 ml)
of 0.05 M Tris-HCl, pH 7.8, and stored at 4°C.

IL-3A viral DNA was purified from the viral particles using a
modification of the protocol described by Miller, S.A., Dykes, D.D., and
Polesky, H.F., Nucleic Acids Res. 16:1215 (1988). Briefly, 100 µl of IL-3A
virus (9.8 X 10^ ^ plaque forming units/ml) was diluted with 400 µl of water and
then mixed with 10 µl TEN (0.5 M Tris-HCl, pH 9.0, 20 mM EDTA, 10 mM
NaCl) and 10 µl of 10% SDS. After incubating at 70°C for 30 minutes the
solution was extracted twice with phenol-chloroform-isoamyl alcohol, extracted
once with chloroform, and precipitated with ice-cold ethanol using methods well
known in the art and resuspended in 500 µl of H2O (Ausubel, F.M., Brent, R.,
Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (Eds.)
(1987) Current Protocols in Molecular Biology, Wiley, New York; Sambrook, J.,
Fritsch, E.F. and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York).

Example 2
CviJI Methyltransferase Clone



The CviJI methyltransferase gene (M.CviJI) from Chlorella virus
IL-3A was cloned and sequenced by Shields et al., Virology 176:16-24 (1990).
Briefly, a Sau3A partial digest of Chlorella virus IL-3A was ligated to BamHI
digested pUC19 and transformed into E. coli strain RR1. This library of plasmids
was restricted with HindIII (AAGCTT) and SstI (GAGCTC), both of which are
inhibited by 5-methylcytidine (5mC) in the AGCT portion of their recognition
sequences, and transformed again into RR1 cells. M.CviJI methylates the internal
cytidine in (G/A)GC(T/C/G) sequences. If the M.CviJI gene is cloned and
expressed appropriately, the plasmid DNA would be expected to be resistant to
HindIII and SstI restriction.

The CviJI methyltransferase gene was originally cloned as a 7.2 kb
insert, termed pIL-3A.22. Plasmid pIL-3A.22 was only partially resistant to CviJI
digestion. Partial digestion is most likely due to the inefficient expression of the
M.CviJI gene and the numerous CviJI sites in both the vector (pUC19 has 45
CviJI sites) and in the insert DNA. The M.CviJI gene was eventually sublocalized
to a region of 3.7 kb by subcloning using methods well known in the art
(Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith,
J.A. and Struhl, K. (Eds.) (1987) Current Protocols in Molecular Biology, Wiley,
New York; Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989), Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York) and testing the subcloned DNA for
sensitivity/resistance to aVufln, and CviJI (Shields et al., supra). The
entire sequence was determined and three open reading frames which could code
for polypeptides of 161, 367, and 162 amino acids, respectively, were identified.
The 367 amino acid open reading frame (ORF) was identified as the M.CviJI gene
by three criteria: (i) it is the only ORF located in the region identified by
transposon mutagenesis; (ii) it has amino acid motifs similar to those of other
cytosine methyltransferases; and (iii) a 1.6 kb DraI fragment containing the 367
amino acid ORF (1101 bp) produces the methyltransferase. This 1.6 kb M.CviJI
encoding fragment was subcloned into the EcoRV site of pBluescript KS(-)
(Stratagene, La Jolla, CA), in the same translational orientation as the lacZ' gene
of this vector. A physical map of the resulting plasmid, termed p710, is shown in
Figure 1.

The plasmid p710 was digested with several endonucleases to
indirectly test the efficiency of M.CviJI expression. Fully active methylase should
render the plasmid DNA completely resistant to digestion by the following
enzymes: HaeIII (which recognizes the sequence GGCC), SacI (which recognizes
the sequence GAGCTC), and HindIII (which recognizes the sequence AAGCTT).
The plasmid was partially resistant to HaeIII (90%) and SacI (90%), and even less
resistant to HindIII (25%) digestion. This lack of complete protection of the
plasmid DNA made it impractical to attempt cloning the three/two base restriction
endonuclease encoded by the R.CviJI gene. Thus, improvements in the efficiency
of M.CviJI expression were required before attempting to clone the R.CviJI gene.
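
The reasoning behind this indirect assay is that each test enzyme's recognition sequence contains an M.CviJI target, (G/A)GC(T/C/G). The Python sketch below is a minimal illustration of that check, using the recognition sequences given in the text; the function names are hypothetical, and EcoRI (GAATTC) is added here only as a negative control that lacks a GC dinucleotide.

```python
import re

# M.CviJI target as given in the text: (G/A)GC(T/C/G). A lookahead is used so
# that overlapping matches are also reported.
TARGET = re.compile(r"(?=([GA]GC[TCG]))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def methylation_targets(site: str) -> list:
    """List (strand, offset, tetramer) for each M.CviJI target in a recognition site."""
    hits = []
    for strand, s in (("+", site.upper()), ("-", reverse_complement(site.upper()))):
        hits.extend((strand, m.start(), m.group(1)) for m in TARGET.finditer(s))
    return hits

# Recognition sequences from the text; EcoRI is an added negative control.
SITES = {"HaeIII": "GGCC", "SacI": "GAGCTC", "HindIII": "AAGCTT", "EcoRI": "GAATTC"}

for enzyme, site in SITES.items():
    print(enzyme, site, methylation_targets(site))
```

A site with at least one hit would be expected to be protected when the methylase is fully active, whereas a site with no hits, such as the control, would remain cleavable.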

The translation efficiency of the M.CviJI gene was improved by
removing extraneous 5' open reading frames, creating a perfect fusion of the
lacZ' Shine-Dalgarno sequence with the methyltransferase start codon (see Figure
1). This was achieved by site-specific oligonucleotide mutagenesis, using the
oligomer

5 '-CAAnTCACACAGGAAACAGCTATGTCTnTCGCACGTrAGAAC-3 ' 
(SEQ ID NO: 1) to precisely remove the intervening lacZ' DNA. The relevant
DNA sequences are indicated in Figure 1 (SEQ ID NO: 12). The mutagenesis was
facilitated by converting the double stranded plasmid DNA of p710 to
single-stranded DNA by co-infecting the E. coli host strain with the helper phage R408
(Russel, M., Kidd, S. and Kelly, M.R., Gene 45:333-338), using methods well
known in the art. The mutagenesis reaction was completed using a commercially
available kit according to the manufacturer's instructions (Mutagene, Bio-Rad,
Hercules, California). The oligonucleotide was annealed to the single-stranded
plasmid, extended in the presence of T4 DNA polymerase, ligated using T4 DNA
ligase, and transformed into competent SURE cells (Stratagene, La Jolla,
California). Transformed cells were then grown overnight as a pool, and the DNA
was isolated and purified.

Enrichment for the mutagenized plasmids was made possible by
virtue of the loss of an XhoI site located in the sequence that was deleted by
mutagenesis. Enrichment was accomplished by digesting the isolated, purified
plasmid DNA with XhoI, followed by dephosphorylation with calf intestinal
alkaline phosphatase (CIAP), and transformation into SURE cells. Plasmid DNA
was isolated from 18 individual colonies and the DNA tested for resistance to
XhoI. Plasmid DNA from 11 colonies was resistant to XhoI digestion, indicating
that they lacked the deleted sequence. Five of these plasmids were restricted with
HaeIII, HindIII, PvuII (which recognizes the sequence CAGCTG), and CviJI. All
five appeared 100% resistant to these enzymes. Four of the plasmids were
sequenced and the deletion was confirmed as being correct. One of these,
pBMC5, was chosen for further modification.

Example 3
Forced Co-Cloning of R.CviJI

The location of the R.CviJI gene on the IL-3A virus genome was
inferred as being 3' to the M.CviJI gene for two reasons: 1) the cloned DNA
sequence 5' to the M.CviJI gene did not produce a restriction activity; and 2)
several attempts to clone the DNA 3' to the M.CviJI gene resulted in
deletions/rearrangements of this downstream region. This information permitted
a forced co-cloning strategy to obtain the restriction endonuclease gene. This
strategy uses a deletion derivative of pBMC5 lacking the 3' half of the M.CviJI
gene. Digestion of the IL-3A genome with the same enzyme used to create the
M.CviJI deletion, followed by ligation of the respective DNAs, transformation,
and digestion with enzymes incapable of recognizing methylated DNA (e.g.,
HaeIII, HindIII, PvuII, CviJI, etc.) should force the selection of clones which
have a restored M.CviJI gene (and thus active methylase enzyme), as well as
downstream DNA. Thus, if a clone is found to be CviJI resistant, the 3' half of
M.CviJI must have been restored, and downstream DNA containing the R.CviJI
gene, at least in part, would presumably be cloned.

The details of this cloning strategy are as follows. pBMC5 has two 
EcoRI sites, one approximately in the middle of the M.CviJI gene, while the other 
site lies in the vector DNA, 3' to the M.CviJI gene (see Figure 1). pBMC5 was 
restricted with EcoRI and ligated at a dilute concentration (10-50 ng/µl) to favor 
circularization without the 3' M.CviJI fragment. The reaction mixture was then 
transformed into competent SURE cells and plated on TY agar containing 
ampicillin. Plasmid DNA from the resulting colonies was tested for the lack of 
this EcoRI fragment by digestion with EcoRI. One of these clones, pBMC5RI, 
was used for the subsequent co-cloning work. Plasmid pBMC5RI was digested 
with EcoRI and dephosphorylated using CIAP. IL-3A genomic DNA was then 
digested to completion with EcoRI. The EcoRI-digested pBMC5RI and IL-3A 
DNAs were combined at a ratio of 1:3 in a ligation reaction using T4 DNA 
ligase, and the products of the ligation reaction were subsequently used to 
transform competent SURE cells. The pBMC5RI/IL-3A transformants were not 
plated, but rather grown overnight in culture as a library or pool of cells. The 
cells were harvested the next day and DNA was isolated and purified. Isolated, 
purified DNA was digested with HaeIII, dephosphorylated with CIAP, and 
transformed into competent SURE cells. The cells were then plated and grown 
overnight. Six colonies grew, of which only one, containing the plasmid 
pCJH1.4, was resistant to HaeIII. The plasmid pCJH1.4 was found to encode 
CviJI restriction activity. Plasmid pCJH1.4 was further characterized to localize 
the gene for CviJI by deletion analysis, subcloning experiments, and sequencing. 
The plasmid pCJH1.4 was deposited with the American Type Culture Collection 
on June 30, 1993 under Accession Number 69341. 

Example 4 

Sequencing of Cloned IL-3A DNA Containing CviJI Gene 

The EcoRI fragment cloned into pCJH1.4 (as described in Example 
3) is 4901 bp in length. Except for the 519 bp corresponding to the 3' portion 
of the M.CviJI gene, the remainder of the 4901 bp EcoRI fragment cloned into 
pCJH1.4 was sequenced using the SEQUAL DNA Sequencing System 
(CHIMERx, Madison, WI) by methods well known in the art. Sequencing was 
accomplished using three approaches: 1) primer walking on pCJH1.4; 2) cloning 
various restriction endonuclease digests of pCJH1.4 into an M13 type sequencing 
vector; and 3) sequencing various restriction endonuclease deletion derivatives of 
pCJH1.4. The nucleotide sequence of 5497 bp of IL-3A viral DNA is shown in 
Figure 2 and set forth in SEQ ID NO.: 2. 

Six open reading frames (ORF) of 1155 bp (ORF1), 468 bp 
(ORF2), 555 bp (ORF3), 1086 bp (ORF4), 397 bp (ORF5) and 580 bp (ORF6), 
which could code for polypeptides containing 358 (41.4 kD), 156 (19.4 kD), 185 
(20.3 kD), 362 (38.9 kD), 132 (14.5 kD) and 193 (21.9 kD) amino acids, 
respectively, were identified (see Figure 3). ORFs 4-6 do not code for the 
R.CviJI gene, as the deletion derivative pCdA12, which lacks the DNA between 
the AvaI and BamHI sites (see Figure 3), does produce CviJI restriction 
endonuclease activity. In addition, the deletion derivative pCdEB7, lacking the 
DNA between the EcoRI and BamHI sites, did not produce CviJI activity. Thus 
ORFl or ORFS were the most likely candidates for encoding the R. CviJI gene. 
The sequence of the 1155 bp ORF1 (SEQ ID NO: 3), its deduced amino acid 
sequence (SEQ ID NO: 4) (as shown in capital letters), plus flanking bases, is 
presented in Figure 4. The vertical line in Figure 4 and the associated arrow 
indicate where the DNA sequence from pCJH1.4 diverges from that of 
pIL-3A.22-8 (Shields, S.L., et al., Virology 76:16-24, 1990). This open reading 
frame (ORF1) is believed to represent the CviJI gene because 14 out of 15 
N-terminal amino acids from the protein sequence (see Example 6) matched the 
predicted translation product of the nucleic acid sequence (Figure 4). Also, the 
32.5 kD molecular weight of the homogeneously purified enzyme described in 
Example 5 matched the predicted translation product of the nucleic acid sequence 
(31.6 kD) if the encoded protein was translated beginning at the GTG codon 
located at nucleotides 299-301 (Figure 4), instead of the 5' ATG codon located 
at nucleotides 59-61. This possibility is not surprising in light of the fact that 
approximately 10% of prokaryotic and eukaryotic gene products begin translation 
with a GTG start codon, rather than the usual ATG codon (Kozak, M., Microbiol. 
Rev. 47:1-45 (1983); Kozak, M., J. Cell. Biol. 108:229 (1989); Gold, L. et al., 
Annu. Rev. Microbiol. 35:365-403 (1981)). The structural gene was identified to 
be 834 nucleotides in length, coding for a protein of 278 amino acids (31.6 kD), 
and is set forth in SEQ ID NO: 4. It is also interesting to note that the CviJI gene 
was shown to possess an in-frame, upstream ATG codon which, if translated, could 
yield a protein with a predicted molecular weight of 41.4 kD (Figure 4). A larger 
molecular weight form possessing CviJI restriction activity has not been detected 
by SDS gel electrophoresis. However, a second peak of CviJI activity which 
eluted separately from the 32.5 kD form was detected in the initial stages of 
enzyme purification. The DNA sequence which could theoretically code for a 
larger form of CviJI would be approximately 1074 nucleotides in length (assuming 
it starts at the upstream ATG codon) and would code for a protein of 358 amino 
acids. 
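
The codon arithmetic behind the two candidate start sites can be checked with a 
short sketch. It assumes only the figures quoted in the text (an 834-nucleotide 
structural gene, a GTG at nucleotides 299-301 and an in-frame ATG at nucleotides 
59-61); the helper name `codons_between` is hypothetical and is not part of the 
original disclosure. 

```python
def codons_between(upstream_start: int, downstream_start: int) -> int:
    """Whole codons separating two start positions (1-based first bases).

    Raises ValueError if the two positions are not in the same reading frame.
    """
    span = downstream_start - upstream_start
    if span % 3 != 0:
        raise ValueError("start codons are not in the same reading frame")
    return span // 3


ATG_START = 59        # upstream ATG, nucleotides 59-61 (from the text)
GTG_START = 299       # GTG start, nucleotides 299-301 (from the text)
SHORT_FORM_NT = 834   # structural gene length stated in the text

short_form_aa = SHORT_FORM_NT // 3                    # residues from the GTG start
extra_aa = codons_between(ATG_START, GTG_START)       # residues added by the ATG start
long_form_aa = short_form_aa + extra_aa
long_form_nt = long_form_aa * 3                       # coding codons only

print(short_form_aa, extra_aa, long_form_aa, long_form_nt)
```

Under these assumptions the GTG-initiated product has 278 residues and the 
ATG-initiated product has 358 residues (1074 nucleotides), in agreement with the 
lengths stated above. 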

Example 5 

Purification of Recombinant CviJI Restriction Endonuclease 



Initially, 20 ml of LB medium (plus 100 µg/ml ampicillin) were 
inoculated with a 1 ml stock of E. coli transformed with the plasmid pCJH1.4 
described above and grown overnight at 37°C with shaking. The next day, 20 ml 
of this initial overnight culture was used to inoculate another 1 liter of LB 
medium and grown overnight. The following day, 50 liters of TB medium (12 
g Bacto-Tryptone, 24 g Bacto Yeast Extract, 4 ml glycerol, 2.31 g KH2PO4, 
12.54 g K2HPO4, 0.1 g MgSO4, µg/ml ampicillin, and water to 1 liter) were 
inoculated with an aliquot of the secondary overnight culture and grown at 37°C 
with 20 liters/min aeration at 200 RPM, until the OD reached 1.0 unit. 
Vigorous aeration was essential for CviJI expression, and a typical yield was 
70 g of cell paste after centrifugation. 

The cell pellet was immediately resuspended in lysis buffer A 
(30 mM Tris-HCl, pH 7.9 at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 
50 µg/ml phenylmethylsulfonyl fluoride (PMSF), 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin) at a volume of 3 ml of buffer A per 1 g of 
cells. The cell suspension was then passed through a Manton-Gaulin cell 
disrupter (Gaulin Corporation, Everett, MA) twice and centrifuged for 1 hr (8000 
RPM, Sorvall GS3 Rotor) at 4°C. To the supernatant, solid NaCl was added to 
a final concentration of 200 mM, and 10% polyethyleneimine (PEI) solution 
was slowly added to a final concentration of 1%. The mixture was stirred for 3 hr, 
and then centrifuged for 30 min at 4°C, 8000 RPM (Sorvall GS3 Rotor). Solid 
ammonium sulfate was then added to the supernatant at 0.5 g/ml and the mixture 
was stirred overnight at 4°C. The precipitated proteins were centrifuged for 1 hr 
(8000 RPM, Sorvall GS3 Rotor) at 4°C and the resulting pellet was dissolved in 
100 ml of buffer B (10 mM K/PO4, pH 7.2, 0.5 mM EDTA, 10 mM 
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.05% Triton X-100, 50 µg/ml 
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin). 
The dissolved protein solution was then dialyzed (14 kD cut-off) for 12 hours 
against three 1 liter changes of buffer B. The dialyzed solution was then diluted 
to 600 ml with buffer B and applied to a 5 x 20 cm phosphocellulose P11 
(Whatman) column (flow rate 100 ml/hr). 

The column was then washed with 1.5 liters of buffer B followed 
by a 0 - 1.5 M NaCl gradient in buffer B (5 liters). R.CviJI eluted at 
approximately 600 mM NaCl. The active fractions were then pooled and 
concentrated to 50 ml with a 76 mm Amicon YM10 membrane. The resulting 
solution was then diluted to 300 ml with buffer C (20 mM Tris-acetate, pH 7.4 
at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 50 mM NaCl, 10% 
glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin) and applied to a 2.5 x 7 cm 
Heparin-Sepharose column at a flow rate of 25 ml/hr. 

After a 400 ml wash with buffer B, R.CviJI was eluted with a 
1.5 liter gradient of 0 - 1.3 M NaCl in buffer C. CviJI eluted at approximately 
400 mM NaCl. The most active fractions were pooled and applied to a 
2.5 x 7 cm Blue-agarose column equilibrated in buffer D (20 mM Tris-acetate pH 
8.0, 1 mM EDTA, 7 mM beta-mercaptoethanol, 30 mM NaCl, 10% glycerol, 
0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin). After a 500 ml wash with buffer D, CviJI 
was eluted with a 0 - 1.5 M NaCl gradient (1.5 liters) in buffer D. Active fractions 
were dialyzed against buffer G (10 mM K/PO4 pH 7.0 (4°C), 10 mM 
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.01% Triton X-100, 50 µg/ml 
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) 
and loaded (20 ml/hr) onto a ceramic HTP column (American International 
Chemical, Natick, MA) (1.5 x 3 cm), equilibrated in buffer F (20 mM Tris-HCl 
pH 8.0, 0.5 mM EDTA, 3 mM DTT, 50 mM K-acetate, 5 mM Mg acetate, 50% 
glycerol). After washing with 100 ml of buffer F, a 400 ml gradient of 0 - 0.9 M 
K/PO4 in buffer F was run. The HTP column was washed with buffer G 
containing 3 mg/ml BSA, then with 1 M phosphate buffer, and reequilibrated in 
buffer G. The active fractions were then pooled and concentrated using a YM10 
membrane to a final volume of 3 - 4 ml. This concentrate was then applied to a 
2.5 x 95 cm Sephadex G-100 column, equilibrated in buffer E (20 mM Tris-HCl 
pH 7.5 (4°C), 5 mM Mg-acetate, 2 mM EDTA, 10 mM beta-mercaptoethanol, 
100 mM NaCl, 5% glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml 
benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) at a flow rate of 
6 ml/hr, and 3 ml fractions were collected. Active fractions were dialyzed against 
storage buffer F. 

The molecular weight of the purified CviJI was determined by 
comparison to known protein standards on a denaturing 10% SDS polyacrylamide 
gel. A single band migrating with an apparent molecular weight of 32.5 
kilodaltons was seen, indicating that, by these criteria, CviJI was purified to 
homogeneity. 

Example 6 

N-Terminal Amino Acid Sequence of R.CviJI 

To confirm that the restriction endonuclease encoded by the insert 
in pCJH1.4 was CviJI, the sequence of the first 15 N-terminal amino acids of 
purified CviJI was determined by the Edman degradation method using an Applied 
Biosystems (Foster City, CA) 477A Liquid Phase Protein Sequencer with an 
on-line 120A PTH Analyzer. The results of that analysis are shown in Table 1. 

Table 1 

N-Terminal Amino Acid Analysis of CviJI 

Amino    Retention   pmol          pmol      pmol     pmol 
Acid #   Time        (Raw)         (-bkgd)   (+lag)   Ratio    Amino Acid ID 

 1       Q 17        A 1 1 O. ii   J. 00     5.10     34.53    THR, MET, ARG, or LYS 
 2       10.32       3.92          1.54      1.82      9.96    GLU 
 3       10.33       4.28          2.22      2.18     11.96    GLU 
 4       27.37       2.23          1.49      1.72      7.64    LYS 
 5       27.35       2.37          1.66      1.67      7.39    LYS 
 6       17.95       3.37          2.76      2.81      9.48    ARG 
 7       28.10       3.19          1.73      2.08      6.09    LEU 
 8       13.58       3.58          2.11      2.49     12.08    ALA 
 9       28.10       3.23          1.68      1.58      4.63    LEU 
10       18.17       0.71          0.78      0.36      1.21    ILE 
11       10.30       1.65          0.78      0.96      5.26    GLU 
12        9.72       8.03          0.41      1.31      3.25    LYS 
13        8.53       1.54          0.53      0.55      2.97    GLN 
14       18.18       2.19          1.74      1.67      5.63    ARG 
15       26.80       3.33          0.43                0.89    ILE 

Abbreviations used: threonine (THR), methionine (MET), arginine (ARG), lysine 
(LYS), glutamic acid (GLU), leucine (LEU), alanine (ALA), isoleucine (ILE) and 
glutamine (GLN). 



The results of this analysis confirm that the protein encoded by the 
DNA insert in pCJH1.4 (ORF1) is CviJI. 

The following Examples illustrate some of the unique properties of 
and important uses for CviJI. 

Example 7 
Analysis of CviJI* Recognition Sequences 

The CviJI* recognition sequence (see Xia, et al., Nuc. Acids Res. 
15: 6025-6090, 1987) was deduced by cloning and sequencing CviJI*-digested 
pUC19 DNA fragments. A complete CviJI* digest of pUC19 was ligated to an 
M13mp18 cloning derivative for nucleotide sequence analysis. The sequence of 
the entire insert was read in order to determine which sites were or were not 
utilized. A total of 100 clones were sequenced, resulting in 200 CviJI* restricted 
junctions, the data for which are compiled in Table 2. 

The dinucleotide GC is found at 205 sites in pUC19. These GC 
sites (shown in Table 2) can be divided into four classes based on their flanking 
Pu/Py structure: the normal recognition sequence (N) and three potential classes 
of relaxed sites (R1, R2 and R3). As seen in Table 2, the fraction of such NGCN 
sites which belong to each classification is roughly equal (22.0%-27.8%). A total 
of 200 CviJI* restricted junctions were analyzed by sequencing 100 cloned inserts. 
If CviJI* cleaved at all NGCN sites without sequence preferences, it would be 
expected that each classification would be restricted at approximately equal 
frequency. Instead, most of the sites cleaved by this treatment were found to be 
normal, or PuGCPy, sites (47.5%). R1 (PyGCPy) and R2 (PuGCPu) restricted 
sites were found at nearly the same frequency (25.5% and 27.0%, respectively). 
Out of 200 CviJI* junctions, no R3 (PyGCPu) restricted sites were found. Thus, 
CviJI* cleaves all NGCN sites except for PyGCPu. As CviJI* cleaves 12 out of 
16 possible NGCN sites, it may be referred to as a 2.25-base recognition 
endonuclease. 
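
The "12 out of 16" count, and the cutting frequency it implies, can be 
illustrated with a minimal sketch. It assumes a random sequence in which all 
four bases are equally frequent, which is an idealization and not a property of 
pUC19; the function name `is_cleaved` and the logarithmic "effective length" 
measure are illustrative choices, not part of the original disclosure. 

```python
import math
from itertools import product

PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_cleaved(site: str) -> bool:
    """True if a 4-base NGCN site is cleaved under CviJI* conditions as
    described in the text: every NGCN site except PyGCPu."""
    if len(site) != 4 or site[1:3] != "GC":
        return False
    return not (site[0] in PYRIMIDINES and site[3] in PURINES)


# Enumerate all 16 NGCN sites and keep the cleavable ones.
all_ngcn = [f"{left}GC{right}" for left, right in product("ACGT", repeat=2)]
cleaved = [site for site in all_ngcn if is_cleaved(site)]

# Assumption: random DNA with equal base frequencies.
p_site = (1 / 4) ** 2 * len(cleaved) / len(all_ngcn)   # chance a position starts a cut site
mean_fragment_bp = 1 / p_site                          # expected spacing between cuts
effective_length = math.log(mean_fragment_bp, 4)       # length of an equally rare fixed site

print(len(cleaved), round(mean_fragment_bp, 1), round(effective_length, 2))
```

Under these assumptions a cut is expected roughly every 21 bp, and the 
logarithmic measure comes out near 2.2 bases, close to the 2.25-base figure 
quoted above, which may rest on a slightly different convention. 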

In addition to the restricted sites, those sites which were not cleaved 
under CviJI* conditions were also compiled for analysis, as shown in Table 2. A 
total of 116 non-cleaved NGCN sites were found in the 100 inserts which were 
sequenced. PyGCPu sites represented the largest class of non-cleaved sites 
(52.6%). In only two cases were PuGCPy sites found not to be cleaved. An 
approximately equal fraction of R1 and R2 sites were not cleaved as were found 
cleaved (22.4% versus 25.5% for R1 and 23.3% versus 27.0% for R2). Based 
on the frequency of cleavage, or lack thereof, a hierarchy of restriction under 
CviJI* conditions is evident, where PuGCPy >> PuGCPu = PyGCPy. 



Example 8 

CviJI* Restriction Generated Oligonucleotides 

Due to the high frequency of CviJI or CviJI* restriction, it is 
possible to generate useful oligonucleotides by digestion and a heat denaturation 
step as described above. The size and number of the resulting oligonucleotides 
are important for subsequent applications such as those described above. If, for 
example, an oligonucleotide is to be used with a large genome, it has to be long 
enough so that the sequence detected has a probability of occurring only once in 
the genome. This minimum length has been calculated to be 17 nucleotides for 
the human genome (Thomas, C.A., Jr., Prog. Nucl. Acid Res. Mol. Biol. 5:315 
(1966)). Oligonucleotides used for sequencing or PCR amplification are generally 
17-24 bases in length. Oligomers of shorter length will often bind at multiple 
positions, even with small genomes, and thus will generate spurious extension 
products. Thus, an enzymatic method for generating oligomers should ideally 
result in polymers greater than 18 bases in length. 
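
The order of magnitude of the minimum unique length can be reproduced with a 
simple counting argument: an oligonucleotide of length n has 4^n possible 
sequences, and n must be large enough that this exceeds the number of positions 
at which it could bind. The sketch below assumes a haploid human genome of 
about 3 x 10^9 bp, random sequence composition, and binding to either strand; 
these are illustrative assumptions and not necessarily those of the cited 
calculation. The function name `min_unique_length` is hypothetical. 

```python
def min_unique_length(genome_bp: float, strands: int = 2) -> int:
    """Smallest n for which 4**n exceeds the number of candidate binding
    positions (genome size times the number of strands considered)."""
    positions = genome_bp * strands
    n = 1
    while 4 ** n <= positions:
        n += 1
    return n


HUMAN_GENOME_BP = 3e9   # assumption: approximate haploid genome size
PUC19_BP = 2686         # pUC19 plasmid length

print(min_unique_length(HUMAN_GENOME_BP))   # both strands considered
print(min_unique_length(PUC19_BP))
```

With both strands counted, the estimate for the human genome is 17 bases, 
which agrees with the value quoted above; a small plasmid such as pUC19 needs 
far fewer bases in principle, although practical primers are longer for reasons 
of annealing stability. 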

The theoretical number of pUC19 CviJI* restriction-generated 
oligomers is 314 (157 CviJI* restriction fragments x 2 oligomers/fragment), the 
size distribution of which is shown in panel A of Figure 5. Most of the expected 
CviJI* restriction-generated oligomers (about 75%) are smaller than 20 bp. This 
assumes that CviJI* is capable of restricting DNA to very small fragments, the 
shortest of which would be 2 bp. However, in practice, about 93% of the cloned 
CviJI* fragments were 20-56 bp in size, and 3% of the fragments generated by 
CviJI* were smaller than 20 bp (panel B of Figure 5). This suggests that CviJI* 
is not able to bind or restrict those fragments below a certain threshold length. 
Since the smallest observed fragment is 18 bp, it may be assumed that this length 
is the minimal size which can be generated from a given larger fragment. 
Whatever the reason for this phenomenon, CviJI* treatment of DNA produces a 
relatively small range of oligomers (mostly 20-60 bases in length), most of which 
are a perfect size class for molecular biology applications. 
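
A theoretical complete digest of the kind summarized in panel A of Figure 5 can 
be sketched in a few lines. The sketch below assumes that cleavage occurs 
between the G and the C of each cleavable NGCN site, leaving blunt ends, and 
that every such site is cut regardless of fragment length; as noted above, the 
enzyme does not behave this way on very short fragments, so the output is an 
upper bound on fragmentation. The function name `digest` is hypothetical. 

```python
import re

# CviJI* sites as described in Example 7: every NGCN except PyGCPu,
# i.e. PuGCN or PyGCPy. The lookahead allows overlapping sites to be found.
SITE = re.compile(r"(?=([AG]GC[ACGT]|[CT]GC[CT]))")


def digest(seq: str) -> list[str]:
    """Return the fragments of a linear sequence after a theoretical complete
    digest, cutting between G and C of each site (assumed cut position)."""
    seq = seq.upper()
    cuts = sorted({match.start() + 2 for match in SITE.finditer(seq)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# AGCT (PuGCPy) is cut; TGCA (PyGCPu) is left intact.
fragments = digest("AAAGCTTTTGCATTT")
print(fragments)
print([len(f) for f in fragments])
```

Each double-stranded fragment yields two oligomers after heat denaturation, 
which is the basis of the "x 2 oligomers/fragment" factor used above. 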

Example 9 
Anonymous Primer Cloning 

Primers are critical tools in many molecular biology applications 
such as PCR, sequencing, and as probes. Anonymous primers are useful as 
sequencing primers for genomic sequencing projects, as probes for mapping 
chromosomes, or to generate oligonucleotides for PCR amplification. 

The Anonymous Primer Cloning (APC) method is a variation of 
shotgun cloning in that unknown sequences of DNA are being randomly cloned. 
However, unlike CviJI shotgun cloning, wherein a partial CviJI** digest of DNA 
is cloned, anonymous primer cloning utilizes a complete CviJI* digest to restrict 
large DNAs into small fragments 20-200 bp in size. These small fragments are 
cloned into a unique vector designed for excising the anonymous DNA as labeled 
primers. The strategy for this method is illustrated in Figure 6. 

As illustrated in Figure 6, the APC strategy reduces large DNAs 
to small fragments, which are cloned and excised for use as primers. Plasmid 
pFEM has a unique arrangement of the restriction sites for MboII and FokI, which 
permits DNA cloned into the EcoRV site to be excised without associated vector 
DNA. This is possible because FokI cleaves 9/13 bases to the left of the 
recognition site shown in pFEM and MboII cleaves 8/7 bases to the right of the 
recognition site shown in pFEM, which is well into the cloned anonymous 
sequence. After MboII or FokI restriction, a known flanking primer is annealed 
(primer 1 or 2) and extended using a DNA polymerase and dNTPs. The primer 
is previously end-labeled, or alternatively, one or more of the dNTPs is 
radioactive. 

After denaturation of the newly synthesized DNA and separation 
from its cognate template, the labeled anonymous primer is ready for use in 
sequencing the original template from which it was subcloned. The presence of 
the pFEM vector sequence fused to the anonymous sequence does not influence 
the enzymatic extension of this primer from its unique binding site, as the vector 
DNA is at the 5' end and the unique sequence is located at the 3' end (all 
polymerases extend 5' to 3'). Both the top and bottom strand primers may be 
excised from pFEM due to the symmetrical placement of restriction sites and 
flanking primer binding sites. Thus, two primers may be derived from each 
cloning event. APC is particularly well suited to the genomic sequencing strategy 
of Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995 (1984), although 
its utility is not limited thereto. 

Example 10 

End Labeling of Restriction-Generated Oligonucleotides 

As is clear from the foregoing examples, digesting DNA with 
CviJI* provides the ability to generate sequence-specific oligonucleotides ranging 
in size from 20-200 bases in length with an average length of 20-60 bases. 
Sequence specific oligonucleotides generated by CviJI* digestion may be labeled 
directly at the 5'-end or at the 3'-end using techniques well known in the art. 

For example, 5'-end labeling may be accomplished by either a 
forward reaction or an exchange reaction using the enzyme T4 polynucleotide 
kinase. In the forward reaction, 32P from [γ-32P]ATP is added to a 5' end of an 
oligonucleotide which has been dephosphorylated with alkaline phosphatase using 
standard techniques widely known in the art and described in detail in Sambrook 
et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring 
Harbor Laboratory Press (1989). In an exchange reaction, an excess of ADP 
(adenosine diphosphate) is used to drive an exchange of a 5'-terminal phosphate 
from the sequence specific oligonucleotide to ADP, which is followed by the 
transfer of 32P from [γ-32P]ATP to the 5' end of the oligonucleotide. This 
reaction is also catalyzed by T4 polynucleotide kinase and is described in 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold 
Spring Harbor Laboratory Press (1989). 

Homopolymeric tailing is another standard labeling technique useful 
in the labeling of CviJI-generated sequence specific oligonucleotides. This 
reaction involves the addition of 32P-labeled nucleotides to the 3'-end of the 
sequence specific oligonucleotides using a terminal deoxynucleotidyl transferase 
(Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold 
Spring Harbor Laboratory Press (1989)). 

Commonly used labeling techniques typically employ a single 
oligonucleotide directed to a single site on the target DNA and containing one or 
a few labels. Oligonucleotides generated by the method of the present invention 
are directed to many sites of a target DNA by virtue of the fact that they are 
generated from a sample of the target sequence. Thus, the hybridization of 
multiple oligonucleotides (labeled by the methods described above) allows a 
significantly enhanced sensitivity in the detection of target sequences. In addition, 
the short length of the labeled oligonucleotides used in the methods of the present 
invention allows a reduction in hybridization time from overnight (as is used in 
conventional methods) to 60 min. 

Although labeling sequence specific oligonucleotides with 32P is 
described above, labeling with other radionucleotides and with non-radioactive 
labels is also within the scope of the present invention. 



Example 11 
Primer Extension Labeling of DNA Using 
Restriction-Generated Oligonucleotides (PEL-RGO) 

Another aspect of the present invention includes methods for 
labeling DNA which include the generation of oligonucleotide primers by 
complete digestion with CviJI*, followed by heat denaturation. PEL-RGO 
requires three steps: 1) generating the sequence-specific oligonucleotides by CviJI* 
restriction of the template DNA; 2) denaturation of the template and primer; and 
3) primer extension in the presence of labeled nucleotide triphosphates. Plasmid 
DNA may be prepared by methods known in the art such as the alkaline lysis or 
rapid boiling methods (Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, New York (1989)). In addition, the vector should be linearized to ensure 
effective denaturation. A restriction fragment may be labeled after separation on 
low melting point agarose gels by methods well known in the art. 

In PEL-RGO labeling, template DNA to be labeled is divided into 
two aliquots; one is used to generate the sequence specific oligonucleotide primers 
and the other aliquot is saved for the primer annealing and extension reaction. 
A typical reaction mix for generating sequence-specific oligonucleotides is 
assembled in a microcentrifuge tube and includes: 100 ng DNA; 2 µl 5X CviJI* 
buffer; 0.5 µl CviJI (1 u/µl); sterile distilled water to 10 µl final volume. CviJI* 
5X restriction buffer includes: 100 mM glycylglycine (Sigma, St. Louis, 
Missouri, Cat. No. G2265) pH adjusted to 8.5 with KOH, 50 mM magnesium 
acetate (Amresco, Solon, Ohio, Cat. No. P0013119), 35 mM β-mercaptoethanol 
(Mallinckrodt, Paris, Kentucky, Cat. No. 60-24-2), 5 mM ATP, 100 mM 
dithiothreitol (Sigma, St. Louis, Missouri, Cat. No. D9779) and 25% v/v DMSO 
(Mallinckrodt Cat. No. 67-68-5). CviJI is obtained from CHIMERx (Madison, 
Wisconsin). The reaction mix is incubated at 37°C for 30 min, followed by the 
inactivation of CviJI by heating at 65°C for 10 min. The CviJI*-restricted DNA 
may be used directly without further purification, or it may be stored at -20°C for 
several months for subsequent labeling reactions. 
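
The working (1X) concentrations in the 10 µl digestion follow directly from the 
5X buffer recipe above. The sketch below is only a dilution calculation; the 
dictionary keys and the function name `final_concentrations` are illustrative 
and are not part of the original protocol. 

```python
# Composition of the 5X CviJI* restriction buffer as listed in the text.
FIVE_X_BUFFER = {
    "glycylglycine (mM)": 100.0,
    "magnesium acetate (mM)": 50.0,
    "beta-mercaptoethanol (mM)": 35.0,
    "ATP (mM)": 5.0,
    "dithiothreitol (mM)": 100.0,
    "DMSO (% v/v)": 25.0,
}


def final_concentrations(stock: dict[str, float], stock_ul: float,
                         total_ul: float) -> dict[str, float]:
    """Dilute every component of a stock buffer into the final reaction volume."""
    return {name: conc * stock_ul / total_ul for name, conc in stock.items()}


# 2 ul of 5X buffer in a 10 ul reaction, as in the typical reaction mix.
working = final_concentrations(FIVE_X_BUFFER, stock_ul=2.0, total_ul=10.0)
for name, value in working.items():
    print(f"{name}: {value:g}")
```

With 2 µl of 5X buffer in 10 µl, each component is present at one fifth of its 
stock concentration, for example 20 mM glycylglycine and 5% v/v DMSO. 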

After heat-inactivating CviJI, 0.2 ng of the digested and undigested 
DNA are electrophoresed on a 1.5% agarose gel, using a suitable molecular 
weight marker for comparison. The CviJI* restriction fragments appear as a low 
molecular weight smear in the 20-200 bp range. 

By way of example, 1-10 ng of linearized pUC19 was labeled under 
the conditions described below. A template-primer cocktail was prepared by 
mixing 10 ng of linearized pUC19 DNA template with 20 ng pUC19 
sequence-specific oligonucleotides (prepared as described above), and the mixture 
was brought to a final volume of 17 µl with sterile distilled water. The 
template-primer mixture was denatured in a boiling water bath for 2 minutes and 
immediately placed on ice. 

The following labeling mixture was then added to the template-primer 
mix: 2.5 µl 10X labeling buffer (500 mM Tris-HCl at pH 9.0, 30 mM MgCl2, 
200 mM (NH4)2SO4, 20 mM dATP, 20 µM dTTP, 20 µM dGTP, 0.4% NP-40); 
5.0 µl [α-32P]dCTP (3000 Ci/mmol, 10 µCi/µl; New England Nuclear, Catalog 
No. NEG013H); 0.5 µl Thermus flavus DNA polymerase (5 u/µl) (Molecular 
Biology Resources, Milwaukee, Wisconsin); up to 25 µl final volume with 
distilled water. The reaction was incubated at 70°C for 30 min and then stopped 
by adding 2 µl of 0.5 M EDTA at pH 8.0 to the reaction mix. 

The efficiency of the labeling reaction is gauged by the percentage 
of radioisotope incorporated into labeled DNA. One microliter of the labeling 
reaction is added to 99 µl of 10 mM EDTA in a microcentrifuge tube. This serves 
as the source of diluted probe for total and trichloroacetic acid (TCA)-precipitable 
counts. Two microliters of diluted probe are spotted onto the center of a glass 
fiber filter disc (Whatman number 934-AH). The disc is allowed to dry and is 
then placed in a vial containing scintillation cocktail for counting total 
radioactivity in a liquid scintillation counter. Another 2 µl aliquot of the diluted 
probe is added to 1 ml of 10% ice cold TCA, followed by the addition of 2 µl of 
carrier bovine serum albumin (BSA). This mixture is then placed on ice for 
10 minutes. The precipitate is then collected on a glass fiber filter disc 
(Whatman No. 934-AH) by vacuum filtration. The filter is then washed with 
20 ml of ice cold 10% TCA, allowed to dry, placed in a vial containing 
scintillation cocktail, and counted. 

Because primer extension oligonucleotide labeling results in net 
DNA synthesis, the specific activity of labeled DNA is calculated using the 
following guidelines. 

Total cpm incorporated = TCA cpm x 50 x 27 

wherein the factor 50 is derived from using 2 µl of a 1:100 dilution for TCA 
precipitation, and the number 27 converts this back to the total reaction volume 
(which is the reaction volume plus 2 µl of stop solution). 

Synthesized DNA (ng of DNA synthesized) = 
    theoretical yield x fraction of radioactivity incorporated 

Theoretical yield (ng of DNA) = 
    (µCi dNTP added x 4 x 330 ng/nmole) / 
    (specific activity of dNTP (Ci/mmole = µCi/nmole)) 

Fraction of incorporated label = TCA precipitated cpm / total cpm 

Specific activity (cpm/µg of DNA) = 
    (total cpm incorporated x 1000) / (synthesized DNA + input DNA) 

wherein 1000 is the factor converting nanograms to micrograms. 

By way of example, the following represents the calculation of 
specific activity for an aliquot of pUC19 DNA labeled using this method, using 
50 µCi of [α-32P]dCTP in a 25 µl reaction, where the TCA precipitated cpm is 
26192 and the total cpm is 102047: 

Total cpm incorporated = 26192 x 50 x 27 = 3.27 x 10^7 cpm 

Theoretical yield = (50 µCi x 4 x 330 ng/nmole) / (3000 µCi/nmole) 
                  = 22 ng 

Fraction of label incorporated = TCA precipitated cpm / total cpm 
                               = 26192 / 102047 = 0.256 

Synthesized DNA = 22 x 0.256 
                = 5.6 ng 

Input DNA = 10 ng 

Specific activity (cpm/µg) = (total cpm incorporated x 1000) / 
                             (synthesized DNA + input DNA) 
                           = (3.27 x 10^7 x 1000) / (5.6 + 10) 
                           = 2.09 x 10^9 cpm/µg 
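
The specific-activity calculation can be expressed as a short function, which is 
convenient when several labeling reactions are compared. The sketch below 
follows the guidelines above; the function and parameter names are illustrative. 
The volume factor is left as a parameter because 26192 x 50 x 27 evaluates to 
about 3.54 x 10^7 cpm, whereas the worked figure of 3.27 x 10^7 cpm corresponds 
to a factor of 25 (the reaction volume without stop solution). 

```python
def theoretical_yield_ng(uci_added: float, sa_uci_per_nmol: float,
                         ng_per_nmol: float = 330.0) -> float:
    """Maximum DNA synthesized if all labeled dNTP (one of four) is incorporated."""
    return uci_added * 4 * ng_per_nmol / sa_uci_per_nmol


def labeling_summary(tca_cpm: float, total_cpm: float, uci_added: float,
                     sa_uci_per_nmol: float, input_ng: float,
                     dilution_factor: float = 50,
                     volume_factor: float = 27) -> dict[str, float]:
    """Apply the guidelines in the text to one labeling reaction."""
    total_incorporated = tca_cpm * dilution_factor * volume_factor
    fraction = tca_cpm / total_cpm
    synthesized_ng = theoretical_yield_ng(uci_added, sa_uci_per_nmol) * fraction
    # Factor of 1000 converts ng to ug.
    specific_activity = total_incorporated * 1000 / (synthesized_ng + input_ng)
    return {
        "total_cpm_incorporated": total_incorporated,
        "fraction_incorporated": fraction,
        "synthesized_ng": synthesized_ng,
        "specific_activity_cpm_per_ug": specific_activity,
    }


# Values from the worked pUC19 example; volume_factor=25 matches its figures.
example = labeling_summary(26192, 102047, uci_added=50, sa_uci_per_nmol=3000,
                           input_ng=10, volume_factor=25)
for key, value in example.items():
    print(f"{key}: {value:.4g}")
```

With a volume factor of 25 the function returns approximately 3.27 x 10^7 cpm 
incorporated and 2.09 x 10^9 cpm/µg; with the stated factor of 27 the specific 
activity is about 8% higher. 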



Unincorporated radioactive label may be removed using standard 
methods well known in the art. 

Comparisons were made between PEL-RGO and RPL under similar 
conditions. A detection limit of 100 fg was observed using 
PEL-RGO labeled DNA, compared to a detection limit of 500 fg with RPL, using 
a radiolabeled probe. 

Example 12 

Thermal Cycle Labeling and Universal Thermal Cycle Labeling 

Thermal Cycle Labeling (TCL) is a method according to the present 
invention for efficiently labeling double-stranded DNA while simultaneously 
amplifying large amounts of the labeled probe. TCL of DNA requires two 
general steps: 1) generation of the sequence-specific oligonucleotides by CviJI* 
restriction of the template DNA; and 2) repeated cycles of denaturation, 
annealing, and extension in the presence of a thermostable DNA polymerase or 
a functional fragment thereof which maintains polymerase activity. Optimal 
results are obtained after 20 such cycles, which are best performed in an automated 
thermal cycling instrument such as a Perkin-Elmer Model 480 thermocycler. In 
conjunction with such an instrument, about 1.5 hr is required to complete this 
protocol. If a thermal cycler is not available, these reactions may be performed 
using heat blocks. As few as 5 cycles may yield probes with acceptable detection 
sensitivities. The generation of sequence specific oligonucleotides for use in this 
method may also be accomplished using the restriction endonuclease reagent 
CGase I described in Example 20 or the restriction endonuclease Aci I, which has 
as a recognition sequence CCGC. 

Non-radioactive labeling of DNA using TCL is accomplished by 
mixing: 10 pg - 100 ng linearized template, 50 ng CviJI*-digested primers 
(prepared as described above), 1.5 µl 10X labeling buffer, 0.5 µl Thermus flavus 
DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee, 
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New 
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP. 

Radioactive labeling of DNA using TCL was accomplished by
mixing 10 pg - 100 ng of CviJI-generated primers, 10 pg - 25 ng of linearized
template, 1.5 µl of 10X labeling buffer, 5 µl of 32P-dCTP (3000 Ci/mmole, 10
µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus DNA polymerase (5 u/µl), and 0.5
µl each of dATP, dGTP, and dTTP (1 mM). The reaction mix was
brought to a volume of 15 µl with deionized H2O, overlaid with mineral oil and
cycled through 20 rounds of denaturation, annealing and extension. A typical
cycling regimen employed 20 cycles of denaturation at 91°C for 5 sec, annealing
at 50°C for 5 sec and extension at 72°C for 30 sec. The reaction is then
terminated by adding 1 µl of 0.5 M EDTA, pH 8.0. The amplified, labeled probe
is a very heterogeneous mixture of fragments, which appears as a smear when
analyzed by agarose gel electrophoresis.
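
The step of bringing the radioactive TCL mix to 15 µl can be checked with a short calculation. The following is a minimal sketch, not part of the protocol itself: the fixed component volumes are those listed above, while the template and primer volumes are hypothetical inputs, because the text specifies those two components by mass rather than by volume.

```python
# Fixed component volumes (microliters) of the radioactive TCL mix,
# as listed in the protocol text.
FIXED_COMPONENTS_UL = {
    "10X labeling buffer": 1.5,
    "32P-dCTP": 5.0,
    "Thermus flavus DNA polymerase": 0.5,
    "dATP": 0.5,
    "dGTP": 0.5,
    "dTTP": 0.5,
}
FINAL_VOLUME_UL = 15.0


def water_to_add(template_ul, primers_ul):
    """Return the deionized water volume (microliters) needed to reach 15 microliters.

    template_ul and primers_ul are assumed, user-supplied volumes.
    """
    used = sum(FIXED_COMPONENTS_UL.values()) + template_ul + primers_ul
    water = FINAL_VOLUME_UL - used
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    return round(water, 2)


# Hypothetical example: 1 microliter each of template and primer solution.
print(water_to_add(template_ul=1.0, primers_ul=1.0))
```

The fixed components account for 8.5 µl, so at most 6.5 µl remains for template, primers, and water combined.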

Universal thermal cycle labeling (UTCL) is a method according to
the present invention for efficiently labeling double-stranded DNA while
simultaneously amplifying large amounts of labeled probe. UTCL is unique in that
no sequence information is required regarding the template. The extension
primers are supplied endogenously via the holo-enzyme of the thermostable DNA
polymerase, and any anonymous DNA template can be labeled by repeated cycles
of denaturation, annealing, and extension in the presence of a labeled
deoxynucleotide triphosphate. Optimal results are obtained after 20 such cycles,
which are best performed in an automated thermal cycling instrument such as a
Perkin-Elmer Model 480 thermocycler. In conjunction with such an instrument,
about 1.5 hr is required to complete this protocol. If a thermal cycler is not
available, these reactions may be performed using heat blocks. As few as 5
cycles may yield probes with acceptable detection sensitivities.

Non-radioactive labeling of DNA using UTCL is accomplished by
mixing: 10 ng linearized template, 1.5 µl 10X labeling buffer, 0.5 µl Thermus
flavus DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee,
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl of 2 mM dTTP.

Radioactive labeling of DNA using UTCL was accomplished by
mixing: 10 pg - 100 ng of linearized template, 1.5 µl of 10X labeling buffer, 5 µl
of 32P-dCTP (3000 Ci/mmole, 10 mCi/ml or 40 µCi/µl), 0.5 µl of Thermus flavus
DNA polymerase (5 u/µl), and 0.5 µl each of dATP, dGTP, and dTTP (1 mM).
The reaction mix was brought to a volume of 15 µl with deionized
H2O, overlaid with mineral oil and cycled through 20 rounds of denaturation,
annealing and extension. A typical cycling regimen employed 20 cycles of
denaturation at 91°C for 5 sec, annealing at 50°C for 5 sec and extension at 72°C
for 30 sec. The reaction is then terminated by adding 1 µl of 0.5 M EDTA, pH
8.0. The amplified, labeled probe is a very heterogeneous mixture of fragments,
which appears as a smear when analyzed by agarose gel electrophoresis.
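
The programmed hold time of the typical cycling regimen follows directly from the step durations given above. The sketch below is illustrative only; the function name is hypothetical, and it counts hold times alone, so temperature ramping and setup (which presumably account for much of the roughly 1.5 hr total quoted for the protocol) are not modeled.

```python
def programmed_hold_seconds(cycles, denature_s, anneal_s, extend_s):
    """Total programmed hold time in seconds, excluding temperature ramps."""
    return cycles * (denature_s + anneal_s + extend_s)


# Typical regimen from the text: 20 cycles of 5 s, 5 s and 30 s holds.
total = programmed_hold_seconds(cycles=20, denature_s=5, anneal_s=5, extend_s=30)
print(total, "seconds, or about", round(total / 60, 1), "minutes")

# The abbreviated 5-cycle variant mentioned in the text.
print(programmed_hold_seconds(cycles=5, denature_s=5, anneal_s=5, extend_s=30))
```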

Estimation of Bio-11-dUTP incorporation:
In order to estimate the level of incorporation of biotin-11-dUTP
into DNA, a serial dilution from 1:10 to 1:10^ of the labeled probe (free of
unincorporated biotin-11-dUTP) is made in TE (10 mM Tris, 1 mM EDTA, pH 8).
A microliter of each dilution is placed on a neutral nylon membrane, and the
DNA sample is bound to the membrane either by UV cross linking for 3 min or
by baking at 80°C for 2 hr.
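
The amount of probe deposited in each 1 µl spot of such a ten-fold dilution series can be tabulated as follows. This is a sketch under stated assumptions: the stock concentration (500 pg/µl) and the number of dilution steps (4) are hypothetical values chosen for illustration and are not taken from the text.

```python
from fractions import Fraction


def tenfold_series(steps):
    """Relative concentrations 1:10, 1:100, ... for the given number of steps."""
    return [Fraction(1, 10 ** k) for k in range(1, steps + 1)]


def probe_per_spot_pg(stock_pg_per_ul, steps, spot_ul=1):
    """Mass of probe (pg) in each spot of a ten-fold serial dilution."""
    return [float(stock_pg_per_ul * spot_ul * f) for f in tenfold_series(steps)]


# Hypothetical stock of 500 pg/ul diluted through four ten-fold steps.
print(probe_per_spot_pg(stock_pg_per_ul=500, steps=4))
```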

The unbound sites on the membrane are blocked using a blocking
buffer (0.058 M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl, 0.02% sodium
azide, 0.5% casein hydrolysate, 0.1% Tween-20) for 15 min at 25°C.
Streptavidin-alkaline phosphatase (Gibco-BRL, Gaithersburg, Maryland, Cat.
No. 9545A) is added to the blocking buffer at a 1:5000 dilution and incubated
for 30 min. The membrane is then rinsed 3 times for 10 min each with wash
buffer (1x PBS [0.058 M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl], 0.3%
Tween, 0.2% sodium azide) and rinsed briefly (5 minutes) with AP buffer (100
mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl pH 9.5). Enough AP buffer containing
4.0 µl/ml nitro blue tetrazolium (NBT) (Sigma Cat. No. N6639) and 3.5 µl/ml of
5-bromo-4-chloro-3-indolyl phosphate (BCIP) (Sigma Cat. No. B6777) is then
added to cover the membrane. The membrane is left in the dark for
approximately 30 minutes or until the reaction is complete. The reaction is
stopped by rinsing in 1x PBS.



Detection Sensitivities

32P-labeled probes generated by the labeling protocol described above
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target
sequence. As little as 10 pg of template DNA is enough to synthesize 5-10 ng of
radiolabeled probe, which is sufficient for screening 5 Southern blots. The
radioactive versions of TCL and UTCL facilitate extremely high specific activities
of labeled probe (about 5 x 10^ cpm/µg DNA), which permit 5-10 fold lower
detection limits than conventional labeling protocols. The synthesis of higher
specific activity probes is probably the net result of the sequence-specific
oligonucleotide primers and their increased length when compared to the short
random primers used in other labeling methods. In addition, the thermal cycling
permits probe amplification.

Biotin-labeled probes generated by the TCL and UTCL protocols
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target sequence. A 15
µl TCL or UTCL reaction yields as much as 5-10 ng of labeled DNA, enough to
probe 5 to 10 Southern blots. Biotin-labeled TCL and UTCL probes provide a
10 fold greater detection sensitivity when compared to RPL biotin probes. In
addition, the thermal cycling permits probe amplification.
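
The equivalence between 25 zeptomoles and 2.5 x 10^-20 moles, and the corresponding number of target molecules, can be verified with a short conversion. The sketch below uses only the SI definition of the zepto prefix and the Avogadro constant; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ZEPTO = 1e-21             # SI prefix zepto


def zeptomoles_to_moles(zmol):
    """Convert an amount in zeptomoles to moles."""
    return zmol * ZEPTO


def zeptomoles_to_molecules(zmol):
    """Convert an amount in zeptomoles to a number of molecules."""
    return zmol * ZEPTO * AVOGADRO


print(zeptomoles_to_moles(25))             # about 2.5e-20 mol
print(round(zeptomoles_to_molecules(25)))  # about fifteen thousand molecules
```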

Non-radioactive, biotinylated probes labeled by the TCL and UTCL
methods were shown to have detection limits that are identical to those of the
radioactive probes. These methods have the advantage of eliminating the need to
work with hazardous radioactive materials without sacrificing sensitivity. In
addition, results are obtained from non-isotopic probes in 3-4 hours compared to
3-4 days for radiolabeled probes. The ability to substitute non-radioactive probes
for radioactive probes may be very useful to clinical laboratories, which do not use
radioisotopes but do need greater detection sensitivities. Research laboratories
favor the use of non-isotopic systems if detection sensitivity is not an issue. The
non-isotopic labeling versions of the TCL and UTCL systems represent a major
improvement in labeling DNA probes. Non-radioactive probes generated by the
methods of the present invention are also useful in the detection of RNA in situ.
An advantage of this system is that labeling protocols of the present invention
yield highly sensitive non-radioactive probes, and the sizes of the probes are
predominantly in the small molecular weight range, so the probes can penetrate
the tissue easily, unlike RPL probes. Because non-radioactive probes labeled using
the labeling protocols of the present invention have the same detection limits as do
radioactive probes similarly labeled, it is within the scope of this invention to use
either radioactive or non-radioactive probes for probing, for example, Southern
blots, Northern blots, for in situ hybridization for the detection of mRNA or DNA
in cells or tissue directly, and for colony or plaque lifts.



Example 13 
Quasi-Random Fragmentation of DNA 



Shotgun cloning and sequencing require the generation of an
overlapping population of DNA fragments. Therefore, conditions were
established for the partial digestion of DNA with CviJI to produce an apparently
random pattern, or smear, of fragments in the appropriate size range.
Conventional methods for obtaining partially restricted DNA include limiting the
incubation time or limiting the amount of enzyme used in the digestion. Initially,
agarose gel electrophoresis and ethidium bromide staining of the treated DNA
were utilized to assess the randomness and size distribution of the fragments.

CviJI was obtained from CHIMERx (Madison, Wisconsin).
Digestion of pUC19 DNA for limited time periods, or with limiting amounts of
CviJI under normal or relaxed conditions, did not produce a quasi-random
restriction pattern, or smear. Instead, a number of discrete bands were observed,
as shown in Figure 7, lane 3 for the CviJI* partial digestion of pUC19. Complete
digests of pUC19 under normal and CviJI* buffer conditions are shown in lanes
1 and 2, respectively. These results show that, under these relaxed conditions,
CviJI has a strong restriction site preference.

To eliminate the apparent restriction site preferences observed
under the partial restriction conditions described above, a series of altered reaction
conditions were explored. Conditions of high pH, low ionic strength, addition of
solvents such as glycerol or dimethylsulfoxide, and/or substitution of Mn2+ for
Mg2+ were systematically tested with CviJI endonuclease using the plasmid
pUC19. Figure 7 shows the results of these tests. In Lane M, a 100 bp DNA
ladder was run. In Lanes 1-4, pUC19 DNA (1.0 µg) was run after digestion at
37°C in a 20 µl volume for the following times and conditions: Lane 1, complete
CviJI digest (1 unit of enzyme for 90 min in 50 mM Tris-HCl, pH 8.0, 10 mM
MgCl2, 50 mM NaCl); Lane 2, complete CviJI* digest (1 unit of enzyme for 90
min in 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20
mM DTT); Lane 3, partial CviJI* digest (0.25 units of enzyme for 30 min in 50
mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20 mM
DTT); Lane 4, partial CviJI** digest (0.5 units of enzyme for 60 min in 10 mM
Tris-HCl, pH 8.0, 10 mM MgCl2, 10 mM NaCl, 1 mM ATP, 20 mM DTT, 20%
v/v DMSO); and Lane 5, uncut pUC19 (1.0 µg).

The digestion condition which yielded the best "smearing" pattern
was obtained when the ionic strength of the relaxed reaction buffer was lowered
and an organic solvent was added (Figure 7, lane 4). Plasmid pUC19 partially
digested under these conditions yields a relatively non-discrete smear. This
activity is referred to as CviJI** to differentiate it from the originally
characterized star activity described in Xia et al., Nucl. Acids Res. 15:6075-6090
(1987). The appearance of diffuse, faint bands overlying a background smear
generated from this 2686 bp molecule indicates that some weakly preferred or
resistant restriction sites may bias the results of subsequent cloning experiments.

DNA was mechanically sheared by sonication utilizing a Heat
Systems Ultrasonics (Farmingdale, New York) W-375 cup horn sonicator as
specified by Bankier et al., Methods in Enzymology 155:51-93 (1987). DNA
fragmented by this method has random single-stranded overhanging ends (ragged
ends).

CviJI**-digested and sonicated samples were size fractionated by
agarose gel electrophoresis and electroelution, or by spin columns packed with the
size exclusion gel matrix Sephacryl S-500 (Pharmacia LKB, Piscataway, N.J.), to
eliminate small DNA fragments. Spin columns (0.4 cm in diameter) were packed
to a height of 1.3 cm by adding 1 ml of Sephacryl S-500 slurry and centrifuging
at 2000 RPM for 5 minutes in a Beckman CPR centrifuge. The columns were
rinsed 3 times with 1 ml aliquots of 100 mM Tris-HCl (pH 8.0) by centrifugation
at 2000 RPM for 2 min. Typically, 0.2-2.0 µg of fragmented DNA in a total
volume of 30 µl was applied to the column. The void volume, containing those
DNA fragments larger than 500 bp, was recovered in the column eluant after
spinning at 2000 RPM for 5 minutes. The capacity of this micro-column
procedure is 2 µg of DNA. Agarose gel electrophoresis and electroelution are
described in detail by Sambrook et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
(1989), and are well known to those skilled in the art. In these experiments, 5 µg
of sample was pipetted into a 2 cm-wide slot on a 1% agarose gel.
Electrophoresis was halted after the bromophenol blue tracking dye had migrated
6 cm. Fragments larger than 750 bp, as judged by molecular size markers, were
separated from smaller sizes and electrophoresed onto dialysis tubing (1000 MW
cutoff). The fractionated material was extracted with phenol-chloroform and
precipitated using ice cold ethanol (50% final volume) and ammonium acetate (2.5
M final concentration).

The ragged ends of the sonicated DNA were rendered blunt
utilizing two different end repair reactions. In one end repair reaction (ER 1),
sonicated DNA was treated according to the procedure outlined by Bankier et al.,
Methods in Enzymology 155:51-93 (1987), where 2.0 µg of sonicated lambda
DNA is combined with 10 units of the Klenow fragment of DNA polymerase I,
10 units of T4 DNA polymerase, 0.1 mM dNTPs (deoxynucleotide
triphosphates = deoxyadenosine triphosphate, deoxythymidine triphosphate,
deoxycytidine triphosphate, and deoxyguanosine triphosphate) and reaction buffer
(50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 10 mM DTT). This mixture was
incubated at room temperature for 30 min, followed by heat denaturation of the
enzymes at 65°C for 15 minutes. In a second end repair reaction (ER 2), an
excess of the reagents and enzymes described above was utilized to ensure a
more efficient conversion to blunt ends. In this reaction, 0.2 µg of the sonicated
lambda DNA sample was treated under the same reaction conditions described
above.

Figure 8 shows comparisons of the size distributions of sonicated
DNA versus DNA that was partially digested with CviJI**. In Lanes M, a 1 kb
DNA ladder was run. In Lanes 1-3, untreated λ DNA (0.25 µg), sonicated λ
DNA (1.0 µg), and CviJI** partially-digested λ DNA (1.0 µg) were run,
respectively. In Lanes 4-6, untreated pUC19 (0.25 µg), sonicated pUC19 (1.0
µg), and CviJI** partially-digested pUC19 (1.0 µg) were run, respectively.

Fragmentation of a large substrate such as lambda DNA (48.5 kb)
revealed essentially no banding differences between the CviJI** method and
sonication, as demonstrated in Figure 8, lanes 2 and 3. In addition, pUC19 DNA
that was partially digested with CviJI** gave a size distribution or "smear" that
closely resembled that achieved with sonication (Figure 8, lanes 5 and 6). As
expected, the minor bias evident with a small molecule such as pUC19 was not
detectable with a larger substrate such as lambda DNA.

The intensity and duration of sonic treatment affect the size
distribution of the resulting DNA fragments. The results obtained from the
sonication of lambda and pUC19 samples (Figure 8) were obtained from three 20
second pulses at a power setting of 60 watts. Sonication-generated smears are
similar, although the size distribution of fragments is consistently greater with
CviJI fragmentation. This result favors the cloning of larger inserts, which
improves the efficiency of end-closure strategies (Edwards et al., Genomics 6:593-
608 (1990)). The size distribution of the DNA fragmented by CviJI** is
controlled by incubation time and amount of enzyme, variables which are readily
optimized by routine analysis. An excess of enzyme or a long incubation time
will completely digest pUC19 DNA, resulting in fragments which range in size
from approximately 20 bp to approximately 150 bp (Figure 7, lanes 1 and 2).
The results shown in Figure 8 were obtained by incubating pUC19 for 40 minutes
and lambda DNA for 60 minutes with 0.33 units of CviJI/µg substrate. The
efficiencies of the two methods for randomly fragmenting DNA were
quantitatively analyzed for use in molecular cloning, as described below.

Example 14 

Rapid DNA Size Fractionation Utilizing Spin Column Chromatography 

The amount of data obtained by the shotgun sequencing approach
is substantially increased if fragments of less than 500 bp are eliminated prior to
the cloning step. Small fragments yield only a portion of the sequence data which
may be collected from polyacrylamide gel based separations and, thus, such small
fragments lower the efficiency of this strategy. Agarose gel electrophoresis
followed by electroelution is commonly used to size fractionate DNA prior to
shotgun cloning (Bankier et al., Methods in Enzymol. 155:51-93 (1987)).
Approximately three hours are required to prepare the agarose gel, electrophorese
the sample, electroelute fragments larger than 500 bp, perform phenol-chloroform
extractions, and precipitate the resulting material.

The results of 5 out of 9 independent trials size-fractionating
CviJI**-fragmented lambda DNA by agarose gel electrophoresis are shown in
Figures 9A-E. Figures 9A-E illustrate the following. In Figure 9A: Lane M,
1 kb DNA ladder; lane λ, untreated λ DNA (0.25 µg); lane 1, unfractionated
(UF) CviJI** partially-digested λ DNA (1.0 µg); lane 2, column-fractionated (CF)
CviJI** partially-digested λ DNA (1.0 µg); lane 3, gel-fractionated (GF) CviJI**
partially-digested λ DNA (1.0 µg). Figures 9B-E show additional trials of the
same treatments, with lanes labeled as in Figure 9A.

Small DNA fragments may also be removed by passing the sample 
through a short column of Sephacryl S-500. Approximately 15 min. are needed 
to prepare the column and 5 min. to fractionate the DNA by this method. 

The results of three out of nine trials using a Sephacryl S-500
column are shown in Figures 9A-C. The efficiency of eliminating small DNA
fragments (<500 bp) by spin column chromatography appears high, and the
reproducibility was excellent. This result is in contrast to the agarose gel
electrophoresis and electroelution data presented in Figures 9A-E, wherein nine
replicate trials of this method yielded nine differently sized products, regardless
of the source of the agarose. Both methods yielded 30-40% recoveries as
measured by UV spectrophotometry. To quantitate the relative efficiencies of the
two fractionation methods, the lambda DNA size fractionated in Figure 9A lanes
2 and 3, and Figure 9B lane 3 were analyzed for cloning efficiency and insert
size, as described below.

Example 15
Cloning Efficiencies of Gel Elution and
Chromatography Fractionation Methods

The efficacy of size selection was quantified by two criteria: 1) by
comparing the relative cloning efficiency of CviJI** partially-digested lambda
DNA fragments fractionated either by agarose gel electrophoresis and
electroelution or microcolumn chromatography, and 2) by determining the size
distribution of the resulting cloned inserts. To reduce potential variables, large
quantities of the cloning vector and ligation cocktail were prepared, ligation
reactions and transformation of competent E. coli were performed on the same
day, numerous redundant controls were performed, and all cloning experiments
were repeated twice. Ligation reactions were carried out overnight at 12°C in 20
µl mixtures using the following conditions: 25 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 1 mM DTT, 1 mM ATP, DNA, and 2000 units of T4 DNA ligase. For
unfractionated samples, 10 ng of fragments and 100 ng of HincII-restricted,
dephosphorylated pUC19 were combined under the above conditions. For
Sephacryl S-500 fractionated samples, 50 ng of size-selected fragments were
ligated with 100 ng of HincII-restricted, dephosphorylated pUC19. This increase
in fractionated DNA was determined empirically to compensate for the lower
concentration of "ends" resulting from the fractionation procedure and/or the
lowered efficiency of cloning larger fragments. Ligation reaction products were
added to competent E. coli DH5αF' (φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR
gyrA96 recA1 relA1 endA1 thi-1 hsdR17(rK-, mK+) supE44 λ-) in a
transformation mixture as specified by the manufacturer (Life Technologies,
Bethesda, Maryland), and aliquots of the transformation mixture were plated on
T agar (Messing, Methods in Enzymol. 101:20-78 (1983)) containing 20 µg/ml
ampicillin, 25 µl of a 2% solution of isopropylthiogalactoside (IPTG) and 25 µl
of a 2% solution of 5-bromo-4-chloro-3-indolylgalactoside (X-GAL). The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. The concentration of the fractionated material was checked
spectrophotometrically so that 50 ng was added to all ligation reactions. This
material was ligated to HincII-digested and dephosphorylated pUC19. This
cloning vector was chosen because it permits a simple blue to white visual assay
to indicate whether a DNA fragment was cloned (white) or not (blue) (Messing,
Methods in Enzymol. 101:20-78 (1983)).
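
The remark about compensating for the lower concentration of "ends" can be illustrated with a molar-ratio calculation. The sketch below is an illustration only: the text states that the 50 ng figure was determined empirically, and the mean insert lengths used here (200 bp and 1000 bp) are hypothetical values. Only the pUC19 length (2686 bp) and the DNA masses come from the text.

```python
PUC19_BP = 2686  # length of pUC19 given in the text


def insert_to_vector_molar_ratio(insert_ng, insert_bp,
                                 vector_ng=100, vector_bp=PUC19_BP):
    """Molar ratio of insert to vector for double-stranded DNA.

    Moles are proportional to mass divided by length, so the
    per-base-pair molecular weight cancels out of the ratio.
    """
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Hypothetical mean insert lengths chosen for illustration.
short = insert_to_vector_molar_ratio(insert_ng=10, insert_bp=200)
long_ = insert_to_vector_molar_ratio(insert_ng=50, insert_bp=1000)
print(round(short, 3), round(long_, 3))
```

Under these assumed lengths, five times the mass of five-fold longer fragments supplies the same number of ligatable molecules per vector molecule.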

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 3.

TABLE 3

Cloning Efficiencies of CviJI** Partially Digested Lambda DNA
Fractionated by Microcolumn Chromatography Versus Agarose Gel
Electroelution.

                                     Colony Phenotype
                                 Trial I          Trial II
DNA/treatment                  Blue   White     Blue   White

Supercoiled pUC19             55000    <10     50000    <10
pUC19/HincII/CIAP               210     <1       320      1
pUC19/HincII/CIAP/
  T4 DNA ligase                 150      4       210      7
λ/CviJI** partial/CF
  + pUC19                       140    240       210    240
λ/CviJI** partial/GFE1
  + pUC19                        98     49       200     18
λ/CviJI** partial/GFE2
  + pUC19                        82     54        95     74

Cloning efficiencies reflect the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP represents treatment with calf intestinal alkaline
phosphatase used to dephosphorylate HincII-digested pUC19 to minimize
self-ligation. CF refers to DNA that was fractionated on Sephacryl S-500 columns as
described above. GFE1 and GFE2 refer to two runs wherein DNA was
fractionated by agarose gel electrophoresis and electroeluted. λ refers to
bacteriophage λ DNA.

These trials represent repeated experiments in which λ DNA
fragments generated by CviJI** partial digestion were ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into the DH5αF' competent cells described
above. The first three rows in Table 3 show controls performed to establish a
baseline to better evaluate the various treatments. Supercoiled pUC19 transforms
E. coli 10 times more efficiently than the HincII-digested plasmid and 150-260
times more efficiently than the HincII-digested and dephosphorylated plasmid.
The number of blue and white colonies which resulted from transforming
HincII-cut and dephosphorylated pUC19 was determined both before and after treatment
with T4 DNA ligase in order to differentiate these background events from
cloning inserts. The background of blue colonies (which represent the uncut
and/or non-dephosphorylated population of molecules) averaged 0.4%, compared
to supercoiled plasmid. The background of white colonies (which presumably
results from contaminating nucleases in the enzyme treatments or genomic DNA
in the plasmid preparations) after HincII-digestion, dephosphorylation, and ligation
of pUC19 averaged 0.014% as compared to the supercoiled plasmid.
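
The 150-260 fold range quoted for the control transformations can be reproduced from the Table 3 counts. This is a minimal sketch: the variable names are illustrative, and the values are the supercoiled pUC19 counts (55000 and 50000 colonies/ng) and the pUC19/HincII/CIAP counts (210 and 320 colonies/ng) for Trials I and II.

```python
# Colonies/ng from Table 3 for Trial I and Trial II.
supercoiled = (55000, 50000)
hincii_ciap = (210, 320)

# Fold difference in transformation efficiency within each trial.
fold = [s / c for s, c in zip(supercoiled, hincii_ciap)]
print([round(f) for f in fold])
```

The two trials give ratios of roughly 260 and 160, which brackets the range stated in the text.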

The number of white colonies obtained when micro-column
fractionated DNA was cloned into pUC19 was 240/ng vector in both trials. The
efficiency of cloning gel fractionated and electroeluted DNA ranged from 18-74
white colonies/ng vector. The data show that column fractionated DNA results
in three to thirteen times the number of white colonies, and presumably
recombinant inserts, as gel fractionated and electroeluted DNA. The size
distribution of the inserts present in these white colonies is depicted in Figures
10A-C. In Figure 10A, a CviJI** partial digest of 2 µg of λ DNA was size
fractionated on a 4 mm by 13 mm column of Sephacryl S-500 at 2,000 x g for 5
minutes. The void volume containing partially digested DNA was directly ligated
to linear, dephosphorylated pUC19 and 43 resulting clones were analyzed for
insert size. The DNA for this experiment is the same as that shown in Figure
9A, lane 2. In Figure 10B, a CviJI** partial digest of 5 µg of λ DNA was size
fractionated by agarose gel electroelution. The eluted DNA was phenol-extracted
and ligated to linear, dephosphorylated pUC19, and the resulting 40 clones were
analyzed for insert size. The DNA for this experiment is the same as that shown
in Figure 9A, lane 3. In Figure 10C, the procedure is the same as in Figure 10B,
except the DNA for this experiment came from Figure 9B, lane 3.

A total of 43 random clones obtained from micro-column
chromatography fractionation were analyzed for insert size (as shown in Figure
10A). Most of these inserts were larger than 500 bp (37/43 or 86%), 11.6%
(5/43) were smaller than 500 bp, and one clone (2.3%) was smaller than 250 bp.
The average insert size was 1630 bp. These results are in contrast to those
obtained by agarose gel fractionation (as shown in Figures 10B and 10C). In the
first trial (Figure 10B) most of the inserts were smaller than 500 bp (26/37 or
70.3%) and only 29.7% (11/37) were larger than 500 bp in size. In the second
trial (Figure 10C) all of the inserts (40 total) were smaller than 500 bp. Thus,
the use of agarose gel electroelution for the size fractionation of DNA results in
unexpectedly variable and low cloning efficiencies.
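
The insert-size percentages quoted above follow from the clone counts. The sketch below recomputes them; the helper name is illustrative and the counts are taken directly from the text.

```python
def percent(part, total):
    """Percentage of clones in a size class, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Micro-column fractionation (Figure 10A): 43 clones analyzed.
print(percent(37, 43), percent(5, 43), percent(1, 43))

# Agarose gel fractionation, first trial (Figure 10B): 37 inserts counted.
print(percent(26, 37), percent(11, 37))
```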

Example 16 

Cloning Sonicated and CviJI**-Digested Lambda DNA 

To compare the cloning efficiencies of sonicated and CviJI**-
digested nucleic acid, λ DNA was fragmented by each of these methods and
ligated to pUC19 which was linearized with HincII and dephosphorylated to
minimize self-ligation.

DNA fragmented by CviJI** digestion and sonication was cloned
both before and after Sephacryl S-500 size fractionation. Sonicated lambda DNA
was subjected to an end repair treatment prior to ligation. Ligations were
performed as described in Example 11. One-tenth of the ligation reaction (2 µl)
was utilized in the transformation procedure, and the fraction of nonrecombinant
(blue) versus recombinant (white) colonies was used to calculate the efficiency of
this process.

The efficacy of the methods was quantified by comparing the
cloning efficiency of lambda DNA fragments generated either by sonication or
CviJI** partial digestion. To reduce potential cloning differences based on size
preference, the size distribution of the DNA generated by these two methods was
closely matched. Other experimental details were designed to reduce potential
variables, as described above. Certain variables were unavoidable, however. For
example, the sonicated DNA fragments required an enzymatic step to repair the
ragged ends as described in Example 1 prior to ligation, whereas the CviJI**
digests were heat-denatured and directly ligated to HincII-digested pUC19.

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 4, section A (unfractionated samples), and
section B (fractionated samples).

Cloning efficiencies represent the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP indicates treatment with calf intestinal alkaline
phosphatase. ER 1 and ER 2 are end repair methods described in Example 13.
λ refers to bacteriophage lambda.

The indicated trials represent repeated experiments in which two
identical sets of lambda DNA fragments generated by AluI complete digestion,
CviJI** partial digestion, or sonication were each ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into DH5αF' competent cells. The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. In case the Sephacryl S-500 size fractionation step introduced inhibitors
of ligation or transformation or resulted in differences attributable to the size of
the material, the sonicated and CviJI**-digested samples were ligated with pUC19
both prior to (A) and after (B) the fractionation steps. The first three rows in
Table 4, sections A and B, are controls performed to establish a baseline to better
evaluate the various treatments. These data show that supercoiled pUC19
transforms E. coli 200-1000 times more efficiently than the HincII-restricted and
dephosphorylated plasmid. Without this dephosphorylation step, the cloning
efficiency is 10% that of the supercoiled molecule (data not presented). The
background of blue colonies averaged 0.5% in these experiments, compared to
supercoiled plasmid, while the background of white colonies averaged 0.005%.

A comparison of the data from unfractionated versus fractionated
samples in Table 4, sections A and B, reveals a general decline in the number of
white and blue colonies obtained after sizing. This decrease is primarily due to
the fact that cloning efficiencies are dependent upon the size of the fragment,
favoring smaller fragments and thus giving higher efficiencies for the
unfractionated material. This is illustrated by comparing the efficiency of cloning
unfractionated and fractionated λ DNA which was completely restricted with AluI.
This four base recognition endonuclease produces blunt ends and cuts λ DNA
(48,502 bp) at 143 sites. Only 25 of the resulting 144 fragments (17%) are larger
than 500 bp. The number of white colonies obtained when unfractionated λ
DNA, completely restricted with AluI, was cloned into pUC19 ranged from 250-
400/ng vector, versus 23-48/ng vector for the fractionated material. This ten fold
decrease was only noticed for the λ AluI digests, and probably reflects the large
portion of small molecular weight fragments (approximately 75%) which is
excluded from the fractionated ligation reactions.
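
The fragment count and the 17% figure above follow from simple arithmetic on a linear molecule. In the sketch below the function name is illustrative; the site and fragment counts are those given in the text.

```python
def fragments_from_linear_digest(cut_sites):
    """A linear molecule cut at n distinct sites yields n + 1 fragments."""
    return cut_sites + 1


total_fragments = fragments_from_linear_digest(143)
large_fragments = 25  # fragments larger than 500 bp, from the text

print(total_fragments)
print(round(100 * large_fragments / total_fragments))  # percent larger than 500 bp
```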

The number of white colonies obtained when unfractionated CviJI**
treated λ DNA was cloned into pUC19 ranged from 160-340/ng vector, versus 68-
90 white colonies/ng vector if the same material was fractionated. Unfractionated
λ DNA, completely digested with AluI, results in cloning efficiencies very similar
to unfractionated CviJI** treated DNA. Sonicated λ DNA is a poor substrate for
ligation, compared to CviJI** treatment, as indicated by the roughly ten-fold
reduced cloning efficiencies.

Enzymatic repair of the ragged ends produced by sonication results
in an increased cloning efficiency. Using conditions described in Example 13 for
the first end repair treatment (ER 1), 10^ (fractionated) and 19-32
(unfractionated) white colonies/ng vector were observed. However, ER 1
conditions may not be optimal, as an alternate end repair reaction (ER 2) (as
described in Example 13) resulted in greater numbers of white colonies (63 and
100/ng vector for fractionated and unfractionated DNA, respectively). In this
reaction, a ten-fold excess of reagents and enzymes was utilized to repair the
sonicated DNA, which apparently improved the efficiency of cloning such
molecules by two to three fold. The data collected from multiple cloning trials
in Table 4, sections A and B, show that CviJI** partial digestion results in three
to sixteen times the number of white colonies as sonicated ER 1-treated DNA.
Even with an optimal end repair reaction for the sonicated fragments, DNA
treated with CviJI** yielded three times more white colonies.

Example 17

Analysis of CviJI** Fragmentation for Shotgun Cloning and Sequencing

The ability of CviJI** partial digestion to create uniformly
representative clone libraries for DNA sequencing was tested on pUC19 DNA.
pUC19 DNA was digested under CviJI** conditions and size fractionated as
described above. The fractionated DNA was cloned into the EcoRV site of
M13SPSI, a lacZ minus vector constructed by adding an EcoRV restriction site
to wild type M13 at position 5605. M13SPSI lacks a genetic cloning selection
trait; therefore, after ligation of the pUC19 fragments into the vector, the sample
was restricted with EcoRV to reduce the background of nonrecombinant plaques.
Bacteriophage M13 plaques were picked at random and grown for 5-7 hours in 2
ml of 2XTY broth containing 20 µl of a DH5αF' overnight culture. After
centrifugation to remove the cells, single-stranded phage DNA was purified using
Sephaglass™ as specified by the manufacturer (Pharmacia LKB, Piscataway, New
Jersey). The single-stranded DNA was sequenced by the dideoxy chain
termination method using a radiolabeled M13-specific primer and Bst DNA
polymerase (Mead et al., Biotechniques 11:76-87 (1991)). The first 100 bases of
76 randomly chosen clones were sequenced to determine which CviJI recognition
site was utilized, the orientation of each insert and how effectively the cloned
fragments covered the entire molecule, as shown in Figure 11. The positions of
the 45 normal CviJI sites (PuGCPy) in pUC19 are indicated beneath the line
labeled "NORMAL" in Figure 11. Similarly, the 160 CviJI* sites (GC) are
indicated beneath the line labeled "RELAXED" in Figure 11. The marks above
these lines indicate the CviJI** pUC19 sites which were found in the set of 76
sequenced random clones. The frequency of cloning a particular site is indicated
by the height of the line, and the left or right orientation of each clone is also
indicated at the top of each mark. There are a total of 205 CviJI and CviJI* sites
in pUC19.
The data presented in Figure 11 demonstrate that, under CviJ I** 
partial conditions, normal CviJ I sites are preferentially restricted over relaxed 
(CviJ I*) sites. Of the 76 clones that were analyzed, only 13%, or 1 in 7, had 
sequence junctions corresponding to a relaxed CviJ I* site. Thirty-five of the 
forty-five possible normal restriction sites were cloned, as compared to eight of 
the possible one hundred sixty relaxed sites. If the enzyme had exhibited no 
preference for normal or relaxed sites under the CviJ I** partial conditions utilized 
here, then 78% of the sequence junctions analyzed should have been generated by 
cleavage at a relaxed CviJ I* site. It may be noted that the relaxed CviJ I* 
restriction sites that were found appear to be clustered in two regions of the 
plasmid that are deficient in normal CviJ I sites. In addition, the combined 
distribution of the normal and relaxed sites which were restricted to generate the 
76 clones appears to be quasi-random. That is, the longest gap between cloned 
restriction sites was no greater than 250 bp and no one particular site is 
over-utilized. 
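
The 78% figure quoted above follows directly from the site counts given for pUC19. The short Python sketch below reproduces that arithmetic; the function name is illustrative only, and the "no preference" model simply assumes that every GC site is cut equally often.

```python
# Site counts for pUC19 as stated in the text.
NORMAL_SITES = 45    # PuGCPy sites
RELAXED_SITES = 160  # remaining GC sites, as counted in the text


def expected_relaxed_fraction(normal: int, relaxed: int) -> float:
    """Fraction of junctions expected at relaxed sites if all sites are cut equally."""
    return relaxed / (normal + relaxed)


expected = expected_relaxed_fraction(NORMAL_SITES, RELAXED_SITES)
observed = 0.13  # fraction of the 76 clones reported at relaxed sites

print(f"expected at relaxed sites with no preference: {expected:.0%}")
print(f"observed at relaxed sites: {observed:.0%}")
print(f"normal sites recovered: {35 / NORMAL_SITES:.0%}")
print(f"relaxed sites recovered: {8 / RELAXED_SITES:.0%}")
```

The gap between the expected and observed fractions is the basis for the statement that normal sites are preferentially restricted.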

A detailed analysis of the distribution of CviJ I** sequence junctions 
found from cloning pUC19 is presented in Table 5. 
The GC sites in pUC19 may be divided into four classes based on 
their flanking Pu/Py structure. The fraction of GC sites observed in pUC19 which 
belong to each classification is roughly equal (22.0-27.8%). A striking difference 
was found between the observed distribution in pUC19 of normal and relaxed (R1, 
R2, R3) CviJ I recognition sites and the distribution revealed by shotgun cloning 
and sequence analysis of CviJ I**-treated DNA. While most of the sites cleaved 
by this treatment were found to be PuGCPy (about 87%), or "normal" restriction 
sites, a significant fraction of the cleavage occurred at PyGCPy (about 6.5%) and 
PuGCPu (about 6.6%) sites, considering the short incubation times and limiting 
enzyme concentrations. The latter two categories of sites, and presumably the 
PyGCPu sites as well, are completely restricted under "relaxed" conditions, 
provided an excess of enzyme is present and sufficient time is allowed (see Figure 
7, and Xia et al., Nucleic Acids Res. 15:6075-6090 (1987)). 
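
The four-way classification of GC sites by their flanking purine/pyrimidine structure can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the sequence is treated as linear and read on one strand only, sites at the extreme ends (which lack a flanking base) are ignored, and the function name and test sequence are hypothetical.

```python
from collections import Counter

PURINES = set("AG")
PYRIMIDINES = set("CT")


def classify_gc_sites(seq: str) -> Counter:
    """Count GC dinucleotides by the Pu/Py class of the bases flanking them."""
    seq = seq.upper()
    counts = Counter()
    # i indexes the G of each candidate GC; both flanking bases must exist.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        if left not in PURINES | PYRIMIDINES or right not in PURINES | PYRIMIDINES:
            continue  # skip ambiguous bases such as N
        left_cls = "Pu" if left in PURINES else "Py"
        right_cls = "Pu" if right in PURINES else "Py"
        counts[f"{left_cls}GC{right_cls}"] += 1
    return counts


# Hypothetical 12-base test sequence, not taken from pUC19.
print(classify_gc_sites("AGCTTGCAAGCA"))
```

Applied to a full plasmid sequence, the four counts divided by their total give the class fractions discussed above; a circular molecule would additionally need the sites spanning the origin to be handled.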

Digestion using CviJ I** treatment results in a relatively even 
distribution of breakage points across the length of the molecule (as shown in 
Figure 11). As described above, Figure 11 depicts a linear map of pUC19 
showing the relative position of the lacZ' gene (α peptide of the β-galactosidase gene) 
and the ampicillin resistance gene (Amp). The marks extending beneath the top line 
(labeled "NORMAL") show the relative position of the 45 normal CviJ I sites 
(PuGCPy) present in pUC19. The marks above the line are the cleavage sites 
found from sequencing the CviJ I** partial library. The height of the line 
indicates the number of clones obtained from cleavage at that site, and the 
orientation of the flag designates the right or left orientation of the respective 
clone. The marks extending beneath the second line (labeled "RELAXED") show 
the relative positions of the 160 CviJ I* sites (GC) present in pUC19. Those marks 
above the line were found from sequencing the CviJ I** partial library. The 
bottom portion of Figure 11 shows the relative position and orientation of the first 
20 clones sequenced, assuming a 350 bp read per clone. CviJ I** cleavage at 
relaxed sites appears to be important in "filling gaps" left by normal restriction. 
The primary goal of this effort was to determine the efficacy of 
these methods for rapid shotgun cloning and sequencing. For these purposes, 
only 100 bases of sequence data were acquired per clone. However, if 350 bases 
of sequence had been determined from each clone, then the entire sequence of 
pUC19 would have been assembled from the overlap of the first 20 clones (Figure 
11). In this sequencing simulation 75% of pUC19 would have been sequenced 
at least 2 times from the first 20 clones. The highest sequencing redundancy 
would have been 6-fold, involving only 2.2% of the DNA. Figure 11 
also shows that most of the 1x sequencing coverage occurred in a region of the 
plasmid with a very low density of normal and relaxed CviJ I restriction sites. 
Most of the single coverage occurs in a 240 bp region of the plasmid between 
1490 bp and 1730 bp where there are only 4 CviJ I relaxed sites. It should also 
be noted that by the 27th randomly picked clone most of this region would have 
been covered a second time. 
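
The sequencing simulation described above (a fixed read length extended from each cloned junction) can be sketched in Python as follows. This is an illustrative sketch only: the map is treated as linear, the clone positions in the usage lines are hypothetical rather than the Figure 11 data, and the function names are not from the original disclosure.

```python
def coverage_depth(target_len, clones, read_len=350):
    """Per-base read depth on a linear map.

    clones: iterable of (start, orientation) pairs, with a 0-based start and
    orientation '+' (read extends rightward) or '-' (read extends leftward).
    """
    depth = [0] * target_len
    for start, orientation in clones:
        if orientation == "+":
            lo, hi = start, min(start + read_len, target_len)
        else:
            lo, hi = max(start - read_len + 1, 0), start + 1
        for pos in range(lo, hi):
            depth[pos] += 1
    return depth


def fraction_at_least(depth, k):
    """Fraction of positions covered by at least k reads."""
    return sum(d >= k for d in depth) / len(depth)


# Hypothetical clone junctions on a 2686 bp target.
clones = [(0, "+"), (300, "+"), (900, "-"), (1200, "+"), (2685, "-")]
depth = coverage_depth(2686, clones)
print(f"covered at least once: {fraction_at_least(depth, 1):.1%}")
print(f"covered at least twice: {fraction_at_least(depth, 2):.1%}")
print(f"maximum depth: {max(depth)}")
```

Supplying the actual junction positions and orientations of the sequenced clones would give the 1x, 2x, and maximum coverage figures of the kind quoted in the text.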

Shotgun sequencing strategies are efficient for accumulating the 
first 80-95% of the sequence data. However, the random nature of the method 
means that the rate at which new sequence is accumulated decreases as more 
clones are analyzed. In Figure 12 the total amount of unique pUC19 sequence 
accumulated was plotted as a function of the number of clones sequenced. The 
points represent a plot of the total amount of determined pUC19 sequence versus 
the total number of clones sequenced. The horizontal dashed line demarcates the 
2686 bp length of pUC19. The smooth curve is the theoretical accumulation 
curve expected for a process in which sequence information is acquired in a 
totally random fashion. It is a continuous plot of the discrete function S(N), 
where 

S(N) = N L e^(-cσ) [((e^(cσ) - 1)/c) + (1 - σ)]. 

This equation is based upon the results developed by Lander et al., Genomics 
2:231-239 (1988) for the progress of contig generation in genetic mapping. In the 
equation: N is the number of clones sequenced, L is the length of clone insert in 
bp, c is the redundancy of coverage or LN/G (where G is the length of the fragment 
being sequenced in bp), and σ = 1 - θ, where θ is the fraction of length that two 
clones must share. The curve in Figure 12 was calculated with G = 2686 bp, L 
= 350 bp, and σ = 1. The plotted points lie close to the theoretical curve, and 
it thus appears that the sequence of pUC19 was accumulated in an apparently 
random fashion utilizing CviJ I** fragmentation and column fractionation. 
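
The accumulation function S(N) can be evaluated numerically as in the Python sketch below, which uses the parameter values stated for Figure 12 (G = 2686 bp, L = 350 bp, σ = 1) as defaults. The function name is illustrative. With σ = 1 the expression reduces algebraically to G(1 - e^(-c)), which the last lines use as a consistency check.

```python
import math


def expected_sequence(n_clones, insert_len=350, target_len=2686, sigma=1.0):
    """Expected unique sequence S(N) in bp after n_clones random clones."""
    c = insert_len * n_clones / target_len  # redundancy of coverage, LN/G
    if c == 0:
        return 0.0
    islands = n_clones * math.exp(-c * sigma)
    island_len = insert_len * ((math.exp(c * sigma) - 1) / c + (1 - sigma))
    return islands * island_len


for n in (5, 10, 20, 40):
    print(n, round(expected_sequence(n)))

# Consistency check for sigma = 1: S(N) should equal G * (1 - exp(-c)).
c20 = 350 * 20 / 2686
assert abs(expected_sequence(20) - 2686 * (1 - math.exp(-c20))) < 1e-6
```

The curve rises steeply at first and flattens as N grows, which is the diminishing return of random sequencing noted at the start of this discussion.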
Example 18 

Shotgun Cloning Utilizing 200 ng of Lambda DNA 



Generally, 2-5 μg of DNA are needed for the sonication and 
agarose gel fractionation method of shotgun cloning in order to provide the 
several hundred colonies or plaques required for sequence analysis (Bankier et al., 
Methods in Enzymol. 155:51-93 (1987)). A ten-fold reduction in the amount of 
substrate required greatly simplifies the construction of such libraries, especially 
from large genomes (Davidson, J. DNA Sequencing and Mapping 1:389-394 
(1991)). The efficiency of constructing a large shotgun library from nanogram 
amounts of substrate was tested utilizing 200 ng of CviJ I**-digested lambda DNA. 
This material was column-fractionated as described previously. In this case, 1/2 
of the column eluant (15 μl containing 50 ng of DNA) was ligated to 100 ng of 
HincII-digested and dephosphorylated pUC19 as described in Example 15. The 
cloning efficiencies of the control DNAs were similar to those reported in Tables 
2 and 3. The 50 ng cloning experiment yielded 230 white colonies per ligation 
reaction in one trial, and 410 white colonies per ligation reaction in a second trial. 
Thus, it should be possible to routinely construct useful quasi-random shotgun 
libraries from as little as 0.2-0.5 μg of starting material. 
Example 19 
Epitope Mapping 

CviJ I* recognizes the sequence GC (except for PyGCPu) in the 
target DNA. Under partial restriction conditions the length of the fragments may be 
controlled by incubation time. Epitope mapping using CviJ I** partial digests 
involves generating DNA fragments of 100-300 bp from a cDNA coding for the 
protein of interest, by methods described in Example 13, inserting them into an 
M13 expression vector, plating out on solid media, lifting plaques onto a 
membrane, screening for binding to the ligand of interest, and picking the positive 
plaques for isolation of the DNA, which is then sequenced to identify the epitope. 
Thus, the same epitope may be expressed as a small fragment or a larger 
fragment. This approach allows one to determine the smallest fragment 
containing the epitope of interest using functional assays such as binding to an 
antibody or other ligand, or using a direct assay for activity. For insertion into 
an M13 vector, linkers may be added to the fragments or the insert may be 
dephosphorylated to ensure that each fragment is cloned alone without ligation of 
multiple inserts. 
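
One way to narrow down the smallest epitope-containing region from a set of sequenced positive clones is to take the overlap of their insert coordinates on the cDNA. The Python sketch below illustrates this under a simplifying assumption that is not stated in the text: all positive clones bind through the same single, contiguous epitope. The function name and coordinates are hypothetical.

```python
def minimal_common_region(positive_clones):
    """Overlap shared by all (start, end) intervals, end exclusive.

    Returns None when the list is empty or the intervals share no position.
    """
    if not positive_clones:
        return None
    start = max(s for s, _ in positive_clones)
    end = min(e for _, e in positive_clones)
    return (start, end) if start < end else None


# Hypothetical insert coordinates (bp on the cDNA) of three positive plaques.
positives = [(100, 400), (250, 500), (200, 330)]
print(minimal_common_region(positives))
```

If the overlap is empty, the positive clones probably carry more than one epitope and would need to be grouped before the same calculation is applied to each group.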

The expression vectors recommended for subcloning of the CviJ I 
fragments are Lambda Zap™ (Stratagene, La Jolla, California) or bacteriophage 
M13-epitope display vectors. An advantage of using an M13-based vector is that 
the peptide or protein of interest may be displayed along with the M13 coat 
protein and does not require host cell lysis in order to analyze the protein of 
interest. The lambda-based vectors yield plaques and hence the protein can be 
directly bound to a membrane filter. 
Example 20 
CGase I 

CGase I, as used herein, refers to a restriction endonuclease reagent which 
cleaves DNA at the dinucleotide CG. CGase I activity is based on the combined 
star activities of the restriction endonucleases Hpa II and Taq I. Under normal 
reaction conditions (10 mM Bis Tris Propane-HCl pH 7.0, 10 mM MgCl2, 1 mM 
DTT; 1 unit of enzyme/μg DNA, 37°C for 1 hr), Hpa II recognizes CCGG and 
cleaves after the first C to leave a 2-base 5' overhang. Under normal reaction 
conditions (100 mM NaCl, 10 mM Tris-HCl pH 8.4, 10 mM MgCl2, 10 mM 2-
mercaptoethanol; 1 unit of enzyme/μg DNA, 65°C for 1 hr) the restriction 
endonuclease Taq I recognizes TCGA and cleaves after the T to leave a 2-base 
5' overhang. 

Reaction conditions have been described for Taq I* activity which decrease 
the cleavage specificity of Taq I (10 mM Tris-HCl pH 9.0, 5 mM MgCl2, 6 mM 
2-mercaptoethanol, 20% DMSO; 2000 units of enzyme/μg DNA, 65°C for 1 hr) 
(Barany, Gene 65:149-165 (1988)). These reaction conditions allow Taq I* to 
cleave DNA at the following sequences: 

Taq I*     TCGA 
           CCGA (TCGG) 
           ACGA (TCGT) 
           TCTA (TAGA) 
           TCAA (TTGA) 
           GCGA (TCGC) 

We are unaware of any literature descriptions of Hpa II* conditions. 
However, the following conditions were established to promote Hpa II* activity 
which are also compatible with Taq I* activity: 5 mM KCl, 10 mM Tris-HCl pH 
8.5, 10 mM MgCl2, 1 mM DTT, 15% DMSO, 100 μg/ml BSA (CGase buffer); 
50 units of enzyme/μg DNA, 50°C for 1 hr. The Hpa II* recognition sites were 
determined by cloning and sequencing Hpa II* restricted fragments. The 
characterized Hpa II* recognition sequences are as follows: 

Hpa II*    CCGG 
           CCGC (GCGG) 
           CCGA (TCGG) 
           ACGG (CCGT) 

Taq I (400 units/μg DNA) and Hpa II (50 units/μg DNA) were then 
combined (CGase I) in CGase I buffer and the following recognition sites were 
identified by cloning and sequencing restricted pUC19 fragments. 



CGase I    GCGC 
           TCGA 
           CCGG 
           GCGT 
           ACGA 
           ACGG (CCGT) 
           GCGG (CCGC) 
           CCGA (TCGG) 



CGase I restriction of natural DNA (i.e., pUC19, lambda) results in fragments 
ranging from 20-200 bp in length (average 20-60 bp). Heat denaturation of these 
fragments generates numerous oligonucleotides of variable length but precise 
specificity for the cognate template, as was the case with CviJ I* digestion. CGase 
I restriction of the small plasmid pUC19 (2686 bp) theoretically yields 174 
restriction fragments, or 384 oligonucleotides after a heat denaturation step. 
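
The CGase I recognition sequences listed above can be located in a DNA sequence with a simple scan, sketched below in Python. This is only an illustration: it reports where the listed tetranucleotides (and their reverse complements) start on the given strand, and it does not model the cut position within each site or incomplete cleavage. The function names and the test sequence are hypothetical.

```python
# CGase I recognition sequences as listed in the text.
CGASE_SITES = ["GCGC", "TCGA", "CCGG", "GCGT", "ACGA", "ACGG", "GCGG", "CCGA"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def site_positions(seq: str, sites=CGASE_SITES):
    """0-based start positions of listed sites, on either strand, in a linear sequence."""
    seq = seq.upper()
    patterns = set(sites) | {reverse_complement(s) for s in sites}
    width = 4  # all listed sites are tetranucleotides
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] in patterns]


# Hypothetical test sequence containing a single CCGG site.
print(site_positions("AACCGGTTACGTAA"))
```

For a circular plasmid that is cut to completion at n distinct positions, n fragments are expected; a linear molecule would give n + 1.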

The "two-cutter" activity of CviJ I* and CGase I represents a unique class 
of restriction endonuclease activity in that no other known restriction 
endonucleases will generate this size range of oligonucleotides. The ability to 
generate numerous oligonucleotides with perfect sequence specificity from any 
DNA, without regard to sequence composition, genetic origin, or prior sequence 
knowledge is one of the properties that CGase I shares with CviJ I*. In addition, 
the generation of numerous oligonucleotides by CviJ I or CGase I results in a 
form of probe or primer amplification not practical using conventional means of 
organic synthesis. 

Based on their ability to recognize a dinucleotide sequence, the present invention 
contemplates the interchangeability of CGase I with CviJ I* in all of the 
applications described herein. 
Example 21 

Purification of CviJ I Restriction Endonuclease from 
IL-3A-Infected Chlorella Cells 

CviJ I was prepared by a modification of the method described by 
Xia et al., Nucl. Acids Res. 15:6075-6090 (1987). Chlorella NC64A cells 
(ATCC Accession No. 75399 deposited on January 21, 1993, American Type 
Culture Collection, Rockville, Maryland) were infected with the virus IL-3A 
(ATCC Accession No. 75354 deposited November 6, 1992, American Type 
Culture Collection, Rockville, Maryland) according to Van Etten et al., Virology 
126:117-125 (1983). Five grams of IL-3A infected Chlorella NC64A cells were 
suspended in a glass homogenization flask with 15 g of 0.3 mm glass beads in 
buffer A (10 mM Tris-HCl pH 7.9, 10 mM 2-mercaptoethanol, 50 μg/ml 
phenylmethylsulfonyl fluoride (PMSF), 20 μg/ml benzamidine, 2 μg/ml o-
phenanthroline). Cell lysis was carried out at 4000 rpm for 90 sec in a Braun 
MSK mechanical homogenizer (Allentown, PA) with cooling from a CO2 tank. 
After lysis, 2 M NaCl was added to a final concentration of 200 mM, after which 
10% polyethyleneimine (PEI) (Life Technologies, Bethesda, MD) (pH 7.5) was 
added to a final concentration of 0.3%. The mixture was then stirred for 2 hrs. 
at 4°C, then centrifuged for 1 hr. at 50,000 g. Ammonium sulfate was added to 
the supernatant to 70% saturation and the solution was stirred overnight. A protein pellet was 
recovered by centrifugation for 1 hr. at 50,000 g. The resulting pellet was 
dissolved in 20 ml of buffer B (20 mM Tris-acetate pH 7.5, 0.5 mM EDTA, 10 
mM 2-mercaptoethanol, 10% glycerol, 30 mM KCl, 50 μg/ml PMSF, 20 μg/ml 
benzamidine [Sigma, St. Louis, Missouri], 2 μg/ml o-phenanthroline [Sigma]) and 
dialysed against 500 ml of buffer B with 3 changes. The dialysed solution was 
then applied to a 1 x 6 cm Heparin-Sepharose (Pharmacia LKB, Piscataway, New 
Jersey) column. After a 50 ml wash with buffer B, a 100 ml gradient of 0 to 0.7 
M KCl in buffer B was run. Fractions having CviJ I activity, as measured by 
digestion of pUC19 DNA and agarose gel electrophoresis, were pooled, diluted 
in 5 volumes of buffer C (10 mM K/PO4 pH 7.4, 0.5 mM EDTA, 10 mM 2-
mercaptoethanol, 75 mM NaCl, 0.05% Triton X-100, 10% glycerol, 50 μg/ml 
PMSF, 20 μg/ml benzamidine, 2 μg/ml o-phenanthroline) and applied to a 1 x 7 
cm Phosphocellulose P11 (Whatman) column equilibrated in buffer C. After 
washing with 30 ml of buffer C, CviJ I was eluted by a 100 ml gradient of 0 to 
0.7 M NaCl in buffer C. At this step CviJ I activity separated from non-specific 
nucleases. CviJ I containing fractions were pooled and diluted in 4 volumes of 
buffer C and applied to a 1 x 4 cm hydroxyapatite HTP column (BioRad, 
Hercules, CA). After washing with 30 ml of buffer C, CviJ I was eluted by a 0 
to 0.7 M potassium phosphate (pH 7.4) gradient in buffer C. Active fractions 
containing CviJ I activity and lacking non-specific nuclease activity were pooled, 
dialysed overnight against storage buffer (50 mM potassium phosphate, 
200 mM KCl, 0.5 mM EDTA, 50% glycerol, 20 μg/ml PMSF) and 
stored at -20°C. 
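
Several steps above bring a solution to a stated final concentration by adding a concentrated stock (for example, 2 M NaCl to 200 mM, or 10% PEI to 0.3%). The Python sketch below shows the underlying arithmetic, assuming the solute is absent from the starting solution and that volumes are additive. The function name and the 18 ml lysate volume are hypothetical, since the text does not state the lysate volume.

```python
def stock_volume_to_add(initial_volume_ml, stock_conc, final_conc):
    """Volume of stock (ml) to add so that the mixture reaches final_conc.

    Derived from stock_conc * v = final_conc * (initial_volume_ml + v).
    Both concentrations must be in the same units.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return initial_volume_ml * final_conc / (stock_conc - final_conc)


# Hypothetical 18 ml lysate brought to 200 mM NaCl with a 2 M stock.
nacl_ml = stock_volume_to_add(18.0, 2.0, 0.2)
print(f"2 M NaCl to add: {nacl_ml:.2f} ml")

# PEI is added after the NaCl, so the starting volume includes the NaCl added.
pei_ml = stock_volume_to_add(18.0 + nacl_ml, 10.0, 0.3)
print(f"10% PEI to add: {pei_ml:.2f} ml")
```

Adding the PEI dilutes the NaCl slightly below 200 mM; whether that matters in practice is not addressed in the text.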

Although the present invention has been described in terms of 
preferred embodiments, it is intended that the present invention encompass all 
modifications and variations which occur to those skilled in the art upon 
consideration of the disclosure herein, and in particular those embodiments which 
are within the broadest proper interpretation of the claims and their requirements. 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Molecular Biology Resources, Inc. 

(ii) TITLE OF INVENTION: Materials and Methods for 
Restriction Endonuclease Applications 

(iii) NUMBER OF SEQUENCES: 13 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

(C) CITY: Chicago 
(D) STATE: Illinois 

(E) COUNTRY: United States of America 
(F) ZIP: 60606-6402 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Clough, David W. 

(B) REGISTRATION NUMBER: 36,107 

(C) REFERENCE/DOCKET NUMBER: 28003/31967/PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 312/474-6300 

(B) TELEFAX: 312/474-0448 

(C) TELEX: 25-3856 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
CAATTTCACA CAGGAAACAC CTATGTCTTT TCCCACCTTA 



44 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5496 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
ATGTCTTTTC CCACGTTAGA ACTATTCCCC GGTATAGCTG GTATTTCACA TGGCCTCAGA 
GGTATATCTA CACCAGTTGC ATTCCTAGAA ATTAATGAAG ACGCACAAAA ATTCTTCAAA 
ACAAAGTTTT CAGATGCATC TGTATTCAAT GACGTTACCA AATTTACCAA ATCGGACTTC 
CCAGAAGACA TAGACATGAT TACTGCGGGA TTCCCGTGCA CTGCGTTTAG TATTGCAGGT 
TCTAGAACTC GATTCGAACA CAAGGAATCC GGTCTCTTTG CTGATGTTGT CCCAATCACG 
GAACACTATA AACCTAAAAT AGTCTTTTTG GAAAACTCCC ATATGTTGTC CCACACTTAC 
AATCTCGATG TCCTCCTAAA AAAGATGCAT CAAATTGGTT ATTTCTCCAA GTGGGTAACT 
TGTCGGGCAT CAATTATAGC AGCCCATCAT CAACGCCACC GGTGGTTTTO TCTCGCGATT 
CGAAAAGATT ATGAACCAGA AGAAATAATT CTATCTGTGA ATCCTACAAA CTTCGACTGG 
GAAAATAATG AACCACCGTG TCAAGTAGAC AATAAGACTT ACCACAATTC AACTCTTGTT 
CCTCTCGCAG CATATTCCGT GGTCCCCGAC CACATCAGAT ATCCTTTCAC CCGTCTATTT 
ACAGGTGATT TTGAGTCATC GTGGAAAACT ACCTTGACAC CTGGGACAAT AATTGGCACC 
GAACACAAAA AAATGAAAGG AACTTACGAT AAAGTCATAA ACGGGTATTA TGAGAACGAT 
GTGTATTATT CTTTTTCAAG GAAAGAAGTT CATCGCGCTC CTCTAAATAT ATCCGTGAAA 
CCACGTGATA TTCCGGACAA ACATAACGGA AAAACACTCG TAGATCGCGA AATGATCAAG 
AAATATTGGT GCACACCATG TGCTAGTTAT CCCACTGCTA CTGCTGGATG CAATGTTCTG 
ACAGACCGTC AGTCACATGC ACTTCCTACA CAAGTCAGGT TTTCATATAG CCCTGTATGT 
GGACGACATT TCTCTGOTAT ATGGTCT6CA TGGTTGATCC CGTATGACCA AGAATATCTT 
GGTTATTTGC TTCAATATQA TTAAAATATT TTGATACACT AAATCGATAT AAGAACAAAA 
CGTTTTACAA TAGAAGGGGC TAAACGTATA ATACTCGAAA AAAAGAGACT TCAAGACAAA 
AAAAGAATTG CGCAACAGAA AAAAAGAATT GCACTTATAG AAAAACAACC AATTCCGGAA 
GACAAAAAAA CAATTGC6GA AGAGAAAAAA CGATTCGCAC TTGAAGAGAA AAAACCAATT 
GCGGAAGAAA AAAAACCAAT CGCGCAACAG AAAAAACGAA TCGTGGAAGA GAAAAAAAGA 
CTTCCACTTA TAGAAAAACA ACGAATTCCG CAAGAGAAAA TTGCGTCGGG GAGAAAAATT 
AGAAAGAGGA TCTCTACAAA TGCAACAAAA CATGAAACAC AATTTCTCAA AGTTATAAAT 
TCAATCTTCG TCGGACCCGC TACTTTTGTi^ TTCGTAGATA TAAAAGGTAA TAAATCCAGA 
GAAATCCACA ACGTTGTAAG ATTCAGACAA TTACAAGGCA GTAAAGCGAA ATCCCCGACC 
GCGTATCTTG ATAGAGAATA TAACAAACCT AAAGCGGATA TAGCAGCGGT AGACATAACC 
CGTAAAGATC TGGCATCGAT ATCCCATAAA GCATCTGAAG GATATCAACA ATATCTAAAA 
ATTTCTGGAA ACAACCTCAA GTTCACACGA AAAGAATTAG AACAAGTTCT ATCGTTCAAG 
AGAAAACTAG TTAGTATGGC ACCGGTATCT AAAATATGGC CTGCTAATAA GACCGTATGC 
TCTCCTATCA AGTCAAATTT GATTAAAAAT CAAGCAATAT TCGGATTTGA TTACGGTAAG 
AAACCAGGAA GGGACAATGT AGACATCATA GGTCAAGGAC GACCAATTAT AACAAAAAGA 
GGTTCCATAT TATATCTTAC ATTCACTGGT TTTAGCGCAT TAAATCGGCA CTTGGAGAAT 
TTTACTGGGA AACATGAACC CGTTTTCTAT GTAAGAACAG AACGCAGTAG TAGCGGGAGA 
AGTATAACAA CTGTCGTCAA TGGTGTCACT TATAAAAATT TAAGATTCTT TATACATCCA 
TACAACTTTC TTTCTTCAAA AACACAACGT ATTATGTAGG ACCATTTTCC CGAGAGACTT 
TGTTGACCGC CTACTAAAAA ATGGTCACGA TATTTGTCTA AAGATGCTCA TAGAAGCAGG 
TGCAAACCTT GACATCGTCA GTGTTGAGTA TACACCATTA CATCTACATG TCGTGATATT 
TGTATAAACG CTAAATACCT ATATATACAA TACGTATCCC CCTAAAAGCG CTTA6ATTTT 
TTAGTTGTAT ACTACTTTTG TATAAGACCT GTAAGTTACA AACTAAAAGT TTCAGCTTTG 
CCTTCGAAAC AAGCAATTAC CGCATGAGAA TAATATCCAT TATGGATCTT TTCTGCTAAT 
AAAACGATAT TTCCTACAGA AGTTTCTATG ATTAGTTCCG AAATATTCAG ATCATCGTCA 
CGTTTTTCTT TACCGTATTT TACTTTCCTC ATCGTCGCAC CAATAAAATC ATCTCGTGTC 
AGTTCATTCX; GCAATTGTGC CGTGACACCA AATCTCTCAC AACAACCTTG ATGTCCATCC 
ATTGCTAACA CTATCGGTAA TCCATGTGTG GTGTGTACGA CCACACCGTT ATAACTATAA 
CACGTGTAGT TGTCGTCTAT ATCATATAAC TCGAGAGCGG TGTGAACTTC TTCAGATCTA 
TTATTAATCG GATCTGATCC ATAAGAAGAA TCTTCATATT TACAAATAAA ATCATCCGAT 
ATGTTCTGCA CACGAACAAC ATTCCTCAAA TTTCTGTCAT GACGAATCTC CATCTCTGAA 
TCATTAGAGA CTTGCGAGTA TATAACATTA TAATTCTTGA TATGATTATT ACGTTTCATA 
TCAACAAAAT ACATATAAAC ACCATACAAA TATTAAAACA CGTTAGTATA TAATGGATAA 
CATTTGCAAT AGTATATTCA CTGCAGTAAA AAATGGCCAC GAAGCTTGTT TGAAGATGAT 
CCTCATTGAA AGAGGTAGCA ATATCAATGA TGTTTCCGAA TCAAAATATG GAAATACACC 
ACTACATATT CCAGCTCATC ATGGTAATGA TGTGTGTTTG AAGATGCTTA TTCACGCAGG 
TGCAAACCTT GATATCACAG ATATTTCTGG AGGAACACCA CTTCATCGTG CCCTTTTCAA 
TGGCCATGAC ATATTCTACA CATCCTCCTA CAACCAGCTG CAAACCTTAG TATCATAACT 
AATTTGGGAT GGATACCGTT ACATTACGCC GCTTTTAATG GTAATGAT6C GATTTTGAGG 
ATGCTCATCG TTGTAAGTCA TAATCTTGAC CTTATCAAT6 ATCGCGGTTG GACGGCGTTA 
CATTACGCGG CTTTTAATGG TCATAGCATG TGCGTCAAGA CGCTTATTCA TGCGCCTCCA 
AATCTTGACA TCACAGATAT TTCGGCATGT ACACCACTTC ATCGTGCGGT TTATAATGAC 
CACGATGCAT GTGTGAAGAT ACTOGTAGAA GCACGTGCAA CTCTTGACGT CATTGATGAT 
ACTGAGTGGG TGCCGTTACA TTAOGCGGCT TTTAATGGTA ATGATGCGAT TTTGAGGATG 
CTCATTGAAG CAGGTGCAGA TATTGATATA TCTAATATAT GTGATTGGAC GGCGTTACAT 
TACCCCCCTC 6AAATGGACA CGATCTGTGT ATAAAAACAC TCATCGAAGC AGGTGGTAAC 
ATCAACGCCC TCAACAAATC CGGGGATACA CCACTAGATA TTGCAGCATG TCATGACATT 
GCAGTATGTG TGATCCTCAT AGTCAATAAG ATCGTTTCGG AGCGGCCGTT GCGTCCGAGT 
GAGTTGTGTG TC^TACCACC AACCTCTCCT CCATTAGGTG ATGTGTTGCG AACGAC6ATG 
CGGCTTCATG GGCGATCGGA AGCT6CAAAG ATCACAGCGC ATCTTCCTGT GGGTCCAAGC 
GATACTCTAC GAACTACTOC 6TTGTGTTT0 AACCCAACAA TTTCCGAGA6 ATCTCCTTOA 
TACTGTATTA ATTGAATGCC T6TAAA0TTA CGCTATTTTT TTCCAAAAAG GGTTTGCATG 
AAATACAACA CCATCTTTTG TACATOGTTT ACCATTAGTT 6TATTCGTGC AATAGAOACC 
ATACGTACCT CCAAATTCAT TTACTTTACC TACAGTATTA CCACTTCCTT TTTTTCCTAT 
AOTAGTATCT AAATTCAACC CTTTGAACTC ATCGCCATTA ACAGACAGAG CGTATGAACC 
GTTTTGT6CC AATTTC»CCT TCAAAACGAT AGTAACCCAT TGACCTCTAC GAATTTTAAC 
CGATCTTATA AGTATCTCCT TACTTCCAAC TCCTTTTTCA AAAGCATACA ACGATCCTGT 
AAGCTTATCC CCAGAACCTG AAATTGTAAA GAACGACTGG AAATGAATAG GTTGCATTAG 
ATCTCTATAC ATATCACTTC CTTCGAAATG AAAATCGTAG TCCCAATTAG GTACGTTCCA 
CCAAGTTTAA TACGCCGTCT TTCCACCGAC ACCGGACATT TCAGCACOAG CCTTGTAAGA 
ATGATATGAT GTGGTTAAAT CTCTATCACC ATCGTTCCAC TTTCCTCTGA ACCGAAGACC 
ATCCATCGTT ATACCTGGTG CAACCTCTAC TAAATTCTTT ATTTCAGCT6 CGGCTCCGGG 
TGGATTAACT CGAGATTC6T CAAATCTAAA ATAT6ATAAC GATGTTCCAA CAGTAGAACC 
ACTGGGTCGT ATGGCAGTTG CTGGAAGGGA AGGTAAAACT TTAGGATATT TCAAATCACC 
AACACCTTGA GGCTTTACTT GAATACTTCT GGGAGATGTT GGTGCTTTCC TCGAAGGTGG 
TTTCGTTGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTC6 TCGAACCTGC 
TTTCCTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 
TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 
TTTCCTCGAA GGTGGTTTCG TCGAAGGTGC TTTCGTTCGC CCAACTCCGC CATGACCATA 5220 
ATCCGTTAAA TTCCCGCATT CACCTAATGA TCTACTCCAT AAAGAACCGG C^CCCATTC 5280 
CATTCTTATT CGTTCTGTAG TATCAGATAT ACATACGAAA TAATGAGAAT CATTTTCCCT 5340 
GCCAAATAAT TTACCAGATT TGCCTTTACA TGACATTATT TGTAATATAA TATTATTATA 5400 
ATTTTAAAAA AACTAACCTC TATTTAAAAT TATGTAATAC GTATTATATC AATGCATCAT 5460 
CTTAATCATT TCCTAACGTA TAACCCTACC GAATTC 5496 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1225 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(1..33, 55..1128) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

^ s?j s ;s 

ill lil lit f ^ CAA GCC OCT AAA CGT 

Met ABp lie Arg Arg Lys Arg Phe Thr lie Clu Gly Ala Lye Arg 
15 20 25 

ill Hi f !^ ^ ^ GAG AAA AAA AGA ATT CCG CAA 

He lie Leu Clu Lye Lye Arg Leu Glu Clu Lye Lye Arg He Ala Clu 
^° 35 40 

C?u SIJ JJJ JTI ^ f^' ^ «^ CGA ATT GCG CAA GAG 

».i.u Lye Lye Arg lie Ala Leu He Clu Lys Gin Arg He Ala Glu Glu 
«5 50 55 

AAA AAA AGA ATT CCG GAA GAG AAA AAA CGA TTC CCA CTT GAA GAG 
Lya Lye Arg He Ala Glu Glu Lye Lye ArJ Se Su SJS 

65 70 

^ ill A?! t^t'^^'^ C*'^ ATC GCG CAA GAG AAA AAA CGA 

Lye Arg He Ala Glu Glu Lye Lye Arg He Ala Clu Clu Lye Lye Arg 

85 90 

ill vS §JJ oiu ^ ^ f " CAA AAA CAA CCA ATT 

lie val Glu Clu Lye Lye Arg Leu Ala Leu He Glu Lye Gin Arg lie 

100 X05 

Sa §S llu ^ T;"" c"^ ^ A" AAG ACG ATC TCT 

Ala Glu Glu Lye He Ala Ser Gly Arg Lys He Arg Lye Arg He Ser 
^ CAA TTT GTC AAA GTT ATA AAT TCA 

Thr Asn Ala Thr Lya Hie Glu Arg Glu Phe Val 5S S J^J s« 

130 

ATG TTC GTC GGA CCC GCT ACT TTT CTA TTC GTA GAT ATA AAA GGT AAT 
Met Phe val Gly Pro Ala Thr Phe Val Phe Val Aep He ^2 Gil 

14S 

AAA TCC AGA GAA ATC CAC AAC GTT GTA AGA TTC AGA CAA TTA CAA CCC 
Lya Ser Arg Glu He Hia Aan Val Val Arg Phe ArJ SJ 

160 165 175 

Jer f 0*^° f AGA GAA TAT AAC AAA 

ser Lya Ala Lya Ser Pro Thr Ala Tyr Val Aap Arg Glu Tyr Aan Lye 

180 185 

irl fJS Ifl tf^ tT* f?" "'^^ ACC GGT AAA GAT GTO GCA 

Pro Lya Ala Aap He Ala Ala Val Asp He Thr Gly Lya Aap Val Ala 
" 195 200 

T?^ ^ TAT CAA CAA TAT CTA AAA ATT 

Trp He Ser Hxa Lya Ala Ser Glu Gly Tyr Gin cin SI ^ J" 

210 215 

III t"'^ ^ ^ <^AA TTA GAA CAA GTT CTA 723 

ser Gly Lya Aan Leu Lya Phe Thr Gly Lya Glu Leu Glu ClS Val Si 

225 230 

|» ^. 5S ;s ?:t is jjj ?ij - 

245 250 

^ ?s 52 - - - - jj:; i?; 

-^33 260 ' 



435 



483 



531 



579 



627 



675 



771 



819 



265 

|s §fj ^ jj; s.« 

v=I^ in s s?j s?« ?ii s ?s 5 1;^* s?j 

S; ZL* ^ ?s js S5 J s:.' s:? S !sj 
s |s ^ H=jj ?K f « S - - - - 

"° 325 

oj; ^ i:^ ;s s 5s is: ;s 

"5 340 

355 



867 



915 



963 



1011 



1059 
s2 ij^ S ill ^^^''^^'^'^^^ TTCCCGACAG ACTTTCTTGA nsB 



365 

CCGCGTACTA AAAAATGGTC ACCATATTTG TCTAAAGATG CTCATAGAAG CAGGTGCAAA 
CCTTGAC 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 369 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Gin Glu Tyr Leu Cly Tyr Lou Val Gin Tyr Aep Met Aap He Arg Arg 

10 is 

Lys Arg Phe Thr He Clu Gly Ala Lya Arg He He Leu Glu Lys Lye 

Arg Leu Clu Glu Lye Lys Arg He Ala Glu Glu Lys Lys Arg He Ala 

40 45 

Leu He Glu Lys Gin Arg He Ala Glu Glu Lys Lys Arg He Ala Glu 

33 60 

Clu Lya Lye Arg Ph. Al. Leu Clu Glu Ly. Lys Arg He Ala Glu Glu 

7S 80 
Lys Lye Arg lie Ala Glu Glu Lya Lye Arg He Val Glu Glu Lye Lya 

90 95 

Arg Leu Ala Leu He Glu Lye Gin Arg He Ala Glu Glu Lye He Ala 

105 HQ 

Ser Cly Arg Lye He Arg Lye Arg He Ser Thr Aen Ala Thr Lye Hie 

120 

Clu Arg Glu Phe Val Lye Val He Asn Ser Het Phe Val Gly Pro Ala 

J.J5 140 

Thr Phe Val Phe Val Asp He Lye Gly Aen Lye Ser Arg Glu He Hie 

155 160 
Aen val Val Arg Phe Arg Gin Leu Gin Gly Ser Lya Ala Lye Ser Pro 

170 

Thr Ala Tyr Val Aep Arg Clu Tyr Aen Lye Pro Lye Ala Aep He Ala 
Ala val ABp He Thr Gly Lye Aep Val Ala Trp He Ser Hie Lye Ala 



121S 
1225 



ser Glu Gly Tyr Gin Gin Tyr Leu Lye He Ser Gly Lye Aen Leu Lye 



205 

220 
Phe Thr Gly Lye Glu Leu Glu Clu Val Leu Ser Phe Lye Arg Lye Val 

° 235 240 

Val Ser Met Ala Pro Val Ser Lye lie Trp Pro Ala Aen Lye Thr Val 
2« 250 ' 255 

Trp ser Pro lie Lye Ser Aen Leu He Lye Aen Gin Ala He Phe Gly 
2fiO 265 270 

Phe Aep Tyr Gly Lye Lye Pro Gly Arg Aep Aen Val Aep He He Gly 
^'^ 280 285 

Gin Gly Arg Pro He He Thr Lye Arg Gly Ser He Leu Tyr Leu Thr 

Phe Thr Gly Phe Ser Ala Leu Aen Cly Hie Leu Glu Aen Phe Thr Gly 

310 320 

Lye His Glu Pro Val Phe Tyr Val Arg Thr Glu Arg Ser Ser Ser Gly 

330 335 

Arg ser He Thr Thr Val Val Aen Gly Val Thr Tyr Lye Aen Leu Arg 

345 

Phe Phe lie Hie Pro Tyr Asn Phe Val Ser Ser Lye Thr Gin Arg He 

360 

Met 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GTAAAACGAC CGCCAGT 

17 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GCCAAGCTTG CATGAT 

16 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATCTTCGCCA ATTCACTGGC CGTCGTTTTA C 

31 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GAATTCGCGA AGAT 14

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATCATCCAAG CTTGGCACTG GCCGTCGTTT TAC 33

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 81 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GTAAAACGAC GGCCAGTGAA TTCGCGAAGA TNNNNNNNNN NNNNNNNNAT CATCCAAGCT 60
TGGCACTGGC CGTCGTTTTA C 81

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 81 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GTAAAACGAC GGCCAGTGCC AAGCTTGGAT GATNNNNNNN NNNNNNNNNN ATCTTCGCGA 60
ATTCACTGGC CGTCGTTTTA C 81

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 270 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(26..148, 190..207, 244..270)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TAACAATTTC ACACAGGAAA CAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA 52
                            Met Thr Met Ile Thr Pro Ser Ser Lys
                              1               5

TTA ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG 100
Leu Thr Leu Thr Lys Gly Asn Lys Ser Trp Tyr Arg Gly Pro Pro Ser
 10                  15                  20                  25

AGG TCG ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT 148
Arg Ser Thr Val Ser Ile Ser Leu Ile Asn His Leu Tyr Asn Lys Arg
                 30                  35                  40

TGATATAAGT TTCTATATAC GTCATTTCGT TATATCAACA A ATG TTA TCA TAT 201
                                              Met Leu Ser Tyr
                                                           45

TAT ACG TAAAACTCGC TTAAAAAAAA ACGAGGTCTA ACTATA ATG TCT TTT CGC 255
Tyr Thr                                         Met Ser Phe Arg
                                                         50

ACG TTA GAA CTA TTT 270
Thr Leu Glu Leu Phe
             55

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Thr Met Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn
1 5 10 15

Lys Ser Trp Tyr Arg Gly Pro Pro Ser Arg Ser Thr Val Ser Ile Ser
20 25 30

Leu Ile Asn His Leu Tyr Asn Lys Arg Met Leu Ser Tyr Tyr Thr Met
35 40 45

Ser Phe Arg Thr Leu Glu Leu Phe
50 55
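
The CDS feature of SEQ ID NO: 12 is given as a three-segment join, and SEQ ID NO: 13 is listed as 56 amino acids. As a consistency check that is not part of the original listing, the Python sketch below parses such a location string and counts the codons it spans. The function names `cds_segments` and `coding_length` are hypothetical, and the parser only handles the simple `join(a..b, ...)` form used here.

```python
import re


def cds_segments(location: str) -> list[tuple[int, int]]:
    """Parse 'join(a..b, c..d, ...)' into (start, end) pairs (1-based, inclusive)."""
    cleaned = re.sub(r"\s+", "", location)  # tolerate stray spaces from scanning
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", cleaned)]


def coding_length(location: str) -> int:
    """Total number of nucleotides covered by all segments."""
    return sum(end - start + 1 for start, end in cds_segments(location))


if __name__ == "__main__":
    loc = "join(26..148, 190..207, 244..270)"
    nt = coding_length(loc)
    print(nt, nt // 3)  # prints 168 56
```

The three segments cover 123, 18 and 27 nucleotides, or 168 in total, which corresponds to 56 codons and agrees with the stated length of SEQ ID NO: 13.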
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page , line 13

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

November 6, 1992

Accession Number

A.T.C.C. 75354

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet □

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

□ This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 79, line 10

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet [X]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

January 21, 1993

Accession Number

A.T.C.C. 75399

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet □

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For receiving Office use only

For International Bureau use only

□ This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 31, line 25

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

June 30, 199A

Accession Number

A.T.C.C. 69341

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet □

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

□ This sheet was received by the International Bureau

Authorized officer

Form PCT/RO/134 (July 1992)
WE CLAIM:

1. A purified and isolated polynucleotide encoding a CviJ I
polypeptide or a variant thereof possessing activity characteristic of CviJ I, said
polynucleotide comprising a polynucleotide as set out in SEQ ID NO: 2.

2. The polynucleotide of claim 1 which is a DNA. 

3. The DNA of claim 2 which is a viral genomic DNA 
sequence or a biological replica thereof. 

4. The DNA of claim 2 which is a wholly or partially 
chemically synthesized DNA or biological replica thereof. 

5. A purified isolated DNA encoding a polypeptide according 
to claim 1 by means of degenerate codons. 

6. A vector comprising a DNA according to claim 2. 

7. The vector of claim 6 which is the plasmid pCJH1.4 (ATCC
Accession No. 69341).

8. A host cell stably transformed or transfected with a DNA
according to claim 2 in a manner allowing the expression in said host cell of a
CviJ I polypeptide or a variant thereof possessing a sequence specificity
characteristic of CviJ I.

9. The host cell according to claim 8, wherein said host cell
is E. coli.
10. A method for producing a CviJ I polypeptide or a variant
thereof possessing biological activity specific to CviJ I, said method comprising the
steps of:

a) growing a transformed host cell containing a vector
according to claim 6 in a suitable nutrient medium; and

b) isolating the CviJ I polypeptide or variant thereof from
said host cell.

11. The method of claim 10 wherein said host cell is E. coli.

12. A recombinant CviJ I polypeptide.

13. A polypeptide produced by the method of claim 10. 

14. A method for restriction endonuclease digestion of DNA
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is cleaved at a dinucleotide sequence selected 
from the group consisting of PyGCPy, PuGCPy, PuGCPu, and wherein Pu = 
purine and Py = pyrimidine. 
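
As an illustration only, and not part of the claims, the following Python sketch scans a DNA string for four-base windows that fall into the three site classes named in claim 14. The function name `find_gc_sites` and the class table are hypothetical. The sketch reports where such windows occur and makes no assumption about the exact position of the cut or about the reaction conditions needed for cleavage.

```python
import re

# The three site classes named in claim 14 (Pu = A or G, Py = C or T).
SITE_CLASSES = {
    "PyGCPy": "[CT]GC[CT]",
    "PuGCPy": "[AG]GC[CT]",
    "PuGCPu": "[AG]GC[AG]",
}


def find_gc_sites(dna: str) -> list[tuple[int, str]]:
    """Return sorted (0-based window start, class name) for each matching NGCN window."""
    dna = dna.upper()
    hits = []
    for name, pattern in SITE_CLASSES.items():
        # A lookahead is used so that overlapping windows are all reported.
        for match in re.finditer("(?=" + pattern + ")", dna):
            hits.append((match.start(), name))
    return sorted(hits)


if __name__ == "__main__":
    for position, site_class in find_gc_sites("AAGCTTTGCATGCC"):
        print(position, site_class)
```

In the example string the window TGCA is skipped, because PyGCPu is not among the classes listed in the claim.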

15. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is digested at 11 of 16 possible dinucleotide 
sequences and wherein said dinucleotide sequences are selected from the group 
consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, and wherein Pu = 
purine and Py = pyrimidine. 



16. The method according to claim 14 wherein said restriction
endonuclease reagent comprises CviJ I.

17. A restriction endonuclease reagent, said restriction
endonuclease reagent comprising in combination, Taq I and Hpa II (CGase I),
said reagent capable of digesting DNA at 11 of 16 possible dinucleotide
sequences, said sequences selected from the group consisting of PuCGPu,
PuCGPy, PyCGPy and PyCGPu, and wherein Pu = purine and Py = pyrimidine.
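
For orientation, the sketch below (Python, not part of the claims) labels an NCGN tetramer by the purine/pyrimidine class of its outer bases and locates the canonical Taq I (TCGA) and Hpa II (CCGG) recognition sequences in a DNA string. The function names are hypothetical. Only the canonical sites are searched for; the broader cleavage at 11 of 16 sequences recited in the claim is not modeled.

```python
import re

# Canonical recognition sequences of the two enzymes under standard conditions.
CANONICAL_SITES = {"Taq I": "TCGA", "Hpa II": "CCGG"}


def pu_py_class(tetramer: str) -> str:
    """Label an NCGN tetramer, e.g. 'TCGA' -> 'PyCGPu'."""
    tetramer = tetramer.upper()
    if len(tetramer) != 4 or tetramer[1:3] != "CG":
        raise ValueError("expected a four-base NCGN sequence")

    def kind(base: str) -> str:
        if base in "AG":
            return "Pu"
        if base in "CT":
            return "Py"
        raise ValueError("unexpected base: " + base)

    return kind(tetramer[0]) + "CG" + kind(tetramer[3])


def canonical_sites(dna: str) -> list[tuple[int, str]]:
    """Return sorted (0-based start, enzyme) for each canonical site in dna."""
    dna = dna.upper()
    hits = []
    for enzyme, site in CANONICAL_SITES.items():
        for match in re.finditer("(?=" + site + ")", dna):
            hits.append((match.start(), enzyme))
    return sorted(hits)


if __name__ == "__main__":
    print(pu_py_class("TCGA"), pu_py_class("CCGG"))
    print(canonical_sites("AATCGATTCCGGA"))
```

Both canonical sites fall in the PyCGPu class, so cleavage in the other three classes recited in the claim would have to come from altered specificity rather than from the canonical sites.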

18. The method according to claim 15 wherein said restriction
endonuclease reagent is selected from the group consisting of Aci I and CGase I. 



19. The method according to claim 16 wherein said digestion 
of DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 

20. The method according to claim 18 wherein said digestion of
DNA is a partial digestion and wherein said digestion generates quasi-random
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose
gel.

21. The method according to claims 16 or 18 wherein said 
digestion is complete, and wherein said digestion generates DNA fragments from 
about 20 base pairs in length to about 200 base pairs in length and wherein said 
fragments have an average length of about 20 to about 60 nucleotides. 
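
A rough back-of-envelope estimate, offered here as an illustration rather than as part of the claims, relates the number of cleavable four-base sequences to the expected spacing between sites. The Python sketch assumes independent, equally frequent bases, which real DNA only approximates; the function name `mean_spacing_bp` is hypothetical.

```python
from fractions import Fraction


def mean_spacing_bp(n_cleavable_tetramers: int) -> float:
    """Expected distance between sites when n of the 256 tetramers are cleavable.

    Assumes each base is independent and equally likely (an idealization).
    """
    probability = Fraction(n_cleavable_tetramers, 256)
    return float(1 / probability)


if __name__ == "__main__":
    # Three Pu/Py classes of NGCN (claim 14) cover 3 x 4 = 12 tetramers.
    print(round(mean_spacing_bp(12), 1))
    # 11 of the 16 NCGN tetramers (claim 15).
    print(round(mean_spacing_bp(11), 1))
```

Under these assumptions the expected spacing is roughly 21 to 23 base pairs, which sits at the lower end of the average fragment length recited above. Base composition, methylation and incomplete cleavage would all shift the observed average.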

22. The method according to claims 19 or 20 wherein said quasi-
random fragments are from about 100 base pairs to about 10,000 base pairs in
length.
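
To illustrate why partial digestion at a very frequent site yields quasi-random fragments, the following Python sketch cuts each eligible site independently with a chosen probability and returns the fragment lengths. It is a toy model and not part of the claims. The function name `partial_digest`, the per-site probability, and the placement of the cut between the G and the C are all assumptions made for illustration.

```python
import random
import re

# Site classes from claim 14: PuGCPu and PuGCPy ([AG]GC[ACGT]) plus PyGCPy.
SITE = re.compile(r"(?=(?:[AG]GC[ACGT]|[CT]GC[CT]))")


def partial_digest(dna: str, cut_probability: float, rng: random.Random) -> list[int]:
    """Return fragment lengths after cutting each site with the given probability."""
    dna = dna.upper()
    # Assumption: the cut falls between the G and the C of each site.
    cut_points = [
        match.start() + 2
        for match in SITE.finditer(dna)
        if rng.random() < cut_probability
    ]
    edges = [0] + cut_points + [len(dna)]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be repeated
    sequence = "".join(rng.choice("ACGT") for _ in range(5000))
    fragments = partial_digest(sequence, 0.02, rng)
    print(len(fragments), max(fragments))
```

With a cut probability of 1.0 the model reduces to a complete digest, and lowering the probability lengthens the fragments. Because sites occur every few tens of bases, the surviving cut positions are spread almost evenly along the molecule.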
23. A method for shotgun cloning and sequencing DNA,
comprising the steps of:

a) partially digesting DNA according to claims 19 or 20; 

b) ligating said partially digested DNA into a linearized 
cloning vector thereby creating a recombinant vector; 

c) introducing said recombinant vector into a host cell; 

d) selecting said host cell for the presence of said recombinant 
vector; 

e) growing and amplifying said host cell containing said
recombinant vector;

f) isolating and purifying said recombinant vector from said 
grown and amplified host cells; and 

g) sequencing said DNA contained in said recombinant vector. 

24. The method according to claim 23 wherein said restriction
endonuclease reagent comprises CviJ I.

25. The method according to claim 23 wherein said restriction 
endonuclease reagent comprises CGase I. 

26. The method according to claim 23 wherein said quasi-random 
fragments are from about 100 base pairs to about 10,000 base pairs in length. 

27. The method according to claim 23 wherein said quasi-random 
fragments are from about 500 bp to about 2,000 bp in length. 

28. The method according to claim 23 wherein said cloning vector 
is selected from the group consisting of plasmids, phage, and cosmids. 
29. The method according to claim 28 wherein said plasmid is
pUC19.

30. The method according to claim 28 wherein said bacteriophage
is λ.

31. The method according to claim 28 wherein said bacteriophage
is M13.

32. The method according to claim 23 wherein said host cell is a
bacterium.

33. The method according to claim 32 wherein said host cell is
E. coli.

34. The method according to claim 23 wherein said sequencing is
dideoxy sequencing.

35. A kit for the shotgun cloning of DNA, said kit comprising in 

association: 

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; 

c) ligation buffer; and 

d) T4 DNA ligase. 
36. The kit of claim 35 further comprising in association:

e) competent host bacteria; 

f) chromatography matrix said matrix useful for the size 
selection of restriction endonuclease digested DNA; 

g) spin filters, said spin filters useful for the size selection of 
restriction endonuclease digested DNA; 

h) a cloning vector; 

i) positive control DNA useful in the monitoring of the
efficiency of said shotgun cloning; and

j) molecular size marker DNA. 

37. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CviJ I. 

38. The kit according to claim 37 wherein said restriction
endonuclease buffer is CviJ I** buffer.

39. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CGase I. 

40. The kit according to claim 39 wherein said restriction 
endonuclease buffer is CGase I buffer. 

41. The kit according to claim 36 wherein said competent host
bacteria is competent E. coli DH5αF'.

42. The kit according to claim 36 wherein said chromatography
matrix is Sephacryl-S500.

43. The kit according to claim 36 wherein said cloning vector is
M13 mp18.



44. A method for labeling DNA, the method comprising the steps
of:

a) digesting an aliquot of template DNA with a restriction
endonuclease reagent according to claim 21 and wherein said
digestion generates sequence-specific DNA fragments;

b) mixing an aliquot of undigested template DNA with said
sequence-specific DNA fragments, denaturing said mixture of
template DNA and sequence-specific DNA fragments thereby
generating denatured template DNA and oligonucleotide primers;

c) annealing said primers to said denatured undigested template
DNA to form a DNA-primer complex;

d) performing an extension reaction from said primers in said
DNA-primer complex using a DNA polymerase in the presence of
one or more nucleotide triphosphates and wherein at least one
nucleotide triphosphate has a label.



45. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CviJ I. 

46. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CGase I. 

47. The method according to claim 44 wherein said extension 
reaction is performed by a DNA polymerase. 
48. The method according to claim 47 wherein said DNA
polymerase is Thermus flavus DNA polymerase. 

49. The method according to claim 44 wherein the one or more 
nucleotide triphosphates are selected from the group consisting of dATP, dCTP, 
dGTP, dUTP and dTTP. 

50. The method according to claim 44 wherein said labeled
nucleotide triphosphate is selected from the group consisting of 32P-labeled
nucleotide triphosphates and 33P-labeled nucleotide triphosphates.

51. The method according to claim 44 wherein said labeled
nucleotide triphosphate is selected from the group consisting of biotin-labeled
nucleotide triphosphates, fluorescein-labeled nucleotide triphosphates,
dinitrophenol-labeled nucleotide triphosphates, and digoxigenin-labeled nucleotide
triphosphates.
52. A method for thermal cycle labeling DNA comprising the
steps of:

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and said DNA fragments thereby generating 
denatured template DNA and oligonucleotide primers; 

c) annealing said primers to said denatured undigested template
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label. 

e) heat-denaturing said labeled extension products; 

f) reannealing said excess primers with said template DNA 
and with said extension products; 

g) performing at least one additional extension reaction from 
said DNA-primer complex using a DNA polymerase. 



53. The method according to claim 52 wherein said restriction 
endonuclease reagent comprises CviJ I. 

54. The method according to claim 52 wherein said restriction 
endonuclease comprises CGase I. 



55. The method according to claim 52 wherein said DNA 
polymerase is a heat stable DNA polymerase. 
56. The method according to claim 55 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof.

57. The method according to claim 52 wherein said extension 
products also serve as templates. 

58. The method according to claim 52 wherein said label is
selected from the group consisting of fluorescein, dinitrophenol, biotin, and
digoxigenin.

59. The method according to claim 52 wherein said label is
selected from the group consisting of 32P, 33P, 3H, 14C, 35S.

60. The method according to claim 52 wherein steps e)-g) are
repeated up to 20 times.



61. A kit for labeling DNA, said kit comprising in association: 

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; and 

c) a labeling buffer. 

62. The kit according to claim 61 wherein said restriction 
endonuclease reagent comprises CviJ I. 

63. The kit according to claim 62 wherein said restriction
endonuclease buffer is CviJ I* restriction endonuclease buffer.
64. The kit according to claim 61 wherein said restriction
endonuclease reagent is selected from the group consisting of CGase I and Aci I. 

65. The kit according to claim 64 wherein said restriction 
endonuclease buffer is CGase I buffer. 

66. The kit of claim 64 further comprising: 

d) a concentrated mixture of 1 or more nucleotide 
triphosphates; 

e) a DNA polymerase; 

f) control DNA, said control DNA being useful for monitoring
the efficiency of labeling. 

67. The kit according to claim 66 wherein said nucleotide mixture 
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP. 

68. The kit according to claim 66 additionally comprising a labeled
nucleotide selected from the group consisting of biotin-11-dUTP, digoxigenin-11-dUTP
and fluorescein-11-dUTP.

69. The kit according to claim 66 additionally comprising a labeled
nucleotide selected from the group consisting of 32P-labeled nucleotides, 33P-labeled
nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and 3H-labeled
nucleotides.

70. The kit according to claim 66 wherein said DNA polymerase
is the Klenow fragment of DNA polymerase I.
71. The kit according to claim 66 wherein said DNA polymerase
is a thermostable DNA polymerase.

72. The kit according to claim 66 wherein said thermostable DNA
polymerase is Thermus flavus DNA polymerase.

73. A method for universal thermal cycle labelling DNA 
comprising the steps of: 

a) mixing an aliquot of template DNA with a holo- 
enzyme of a thermostable DNA polymerase, whereby the 
polymerase provides endogenously purified DNA primers; 

b) denaturing said mixture of template DNA and said 
endogenous DNA primers; 

c) annealing said mixture of denatured template DNA 
and said endogenous DNA primers to form a DNA-primer 
complex; 

d) performing an extension reaction from said 
endogenous DNA primers in said DNA-primer complex 
using said DNA polymerase in the presence of one or more 
nucleotide triphosphates and wherein at least one nucleotide 
triphosphate has a label; 

e) heat-denaturing said labeled extension products; 

f) reannealing said endogenous primers with said
template DNA and with said extension products; 

g) performing at least one additional extension reaction 
from said DNA-primer complex using a DNA polymerase. 
74. The method according to claim 73 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof.

75. The method according to claim 73 wherein said extension
products also serve as templates.

76. The method according to claim 73 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and 
digoxigenin. 

77. The method according to claim 73 wherein said label is
selected from the group consisting of 32P, 33P, 3H, 14C, 35S.

78. The method according to claim 73 wherein steps e)-g) are 
repeated up to 20 times. 



79. A kit for labeling DNA, said kit comprising in association:

a) a holo-enzyme of a thermostable DNA polymerase; 
and 

b) a DNA polymerase buffer. 



80. The kit of claim 79 further comprising: 

c) a concentrated mixture of 1 or more nucleotide 
triphosphates; 

d) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 
81. The kit according to claim 80 wherein said nucleotide mixture
is an equimolar mixture of one or more nucleotides selected from the group
consisting of dCTP, dTTP, dATP, and dGTP.

82. The kit according to claim 80 additionally comprising a labeled
nucleotide selected from the group consisting of biotin-11-dUTP, digoxigenin-11-dUTP
and fluorescein-11-dUTP.

83. The kit according to claim 80 additionally comprising a labeled
nucleotide selected from the group consisting of 32P-labeled nucleotides, 33P-labeled
nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and 3H-labeled
nucleotides.

84. The kit according to claim 80 wherein said thermostable DNA
polymerase is Thermus aquaticus DNA polymerase.

85. The kit according to claim 80 wherein said thermostable DNA
polymerase is Thermus flavus DNA polymerase.

86. A method for labeling of restriction-generated oligonucleotides,
the method comprising the steps of:

a) digesting an aliquot of template DNA according to
claim 21;

b) heat denaturing said digested DNA thereby generating
sequence-specific oligonucleotides; and

c) labeling said sequence-specific oligonucleotides with
a label capable of detection.

87. The method according to claim 86 wherein said restriction-
generated oligonucleotides are labeled on the 5' end.

88. The method according to claim 86 wherein said restriction- 
generated oligonucleotides are labeled on the 3' end. 

89. The method according to claim 86 wherein the label is
radioactive.



90. The method according to claim 86 wherein the label is non- 
radioactive. 
91. A method for anonymous primer cloning, the method
comprising the steps of:

a) digesting an aliquot of template DNA according to claim 21
thereby generating anonymous DNA fragments;

b) digesting a plasmid cloning vector with a restriction
endonuclease thereby creating a cloning site for insertion of said
anonymous DNA fragments;

c) ligating the anonymous DNA fragments of step a) into the
cloning site of step b) thereby creating recombinant plasmids;

d) transforming competent bacteria with the recombinant
plasmids;

e) selecting transformed colonies;

f) purifying the recombinant plasmids from said transformed
bacteria;

g) digesting the recombinant plasmid with a restriction
endonuclease, said restriction endonuclease being capable of cutting
said recombinant plasmid at a site, said site lying within the cloned
anonymous DNA fragment;

h) annealing one or more extension primers to the digested
recombinant plasmid, said extension primers being complementary
to plasmid sequences flanking the anonymous primer;

i) extending the extension primer in a template-dependent
fashion in the presence of one or more nucleotide triphosphates and
a DNA polymerase; and

j) denaturing the said hybridized extended primer.



92. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CviJ I. 
93. The method according to claim 91 wherein said restriction
endonuclease reagent comprises CGase I. 

94. The method according to claim 91 wherein said plasmid 
cloning vector is pFEM. 

95. The method according to claim 94 wherein the restriction 
endonuclease of step b) is Eco RV. 

96. The method according to claim 91 wherein said extension 
primer has a label capable of detection. 

97. A kit for anonymous primer cloning comprising in association: 

a) a restriction endonuclease reagent, according to claims 16 or
18;

b) a restriction endonuclease buffer;

c) a cloning vector; 

d) competent bacteria; 

e) one or more extension primers said extension primers being 
complementary to plasmid sequences flanking said anonymous 
primers; and 

f) a DNA polymerase reagent. 

98. The kit according to claim 97 wherein said restriction
endonuclease reagent comprises CviJ I.

99. The kit according to claim 98 wherein said restriction
endonuclease buffer is CviJ I* buffer.
100. The kit according to claim 97 wherein said restriction
endonuclease reagent is selected from the group consisting of CGase I and Aci I.

101. The kit according to claim 100 wherein said restriction
endonuclease buffer is CGase I buffer.

102. The kit according to claim 97 wherein said cloning vector is
pFEM.
TAACAATTTCACACAGGAAACAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA TTA
                           M   T   M   I   T   P   S   S   K   L

ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG AGG TCG
 T   L   T   K   G   N   K   S   W   Y   R   G   P   P   S   R   S

ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT TGA TATAAGTTT
 T   V   S   I   S   L   I   N   H   L   Y   N   K   R   *

GTATATACGTCATTTCGTTATATCAACAA ATG TTA TCA TAT TAT ACG TAA AACTGGCT
                               M   L   S   Y   Y   T   *

TAAAAAAAAACGAGGTGTAACTATA ATG TCT TTT CGC ACG TTA GAA CTA TTT ...




Ampf^ 
M.CviJ I

' I"" l'" r'' 5" S** r'*' n=*^ AATATTTTCATACACTAA ATC GAT ATA 



68 AGA ACA AAA CST TTT ACA ATA CAA GC= OCT AAA CCT ATA ATA CTC CAA AAA AAC ACA CTT 
.28 CAA CAC AAA AAA ACA ATT CCC CAA CAC AAA AAA ACA ATT CCA CTT a'ta CAA MA CAA CCA 
.88 ATT CCC CAA CAC AAA AAA ACA ATT CCC CAA CAC AAA AAA CCA TTC CCA CTT CAA CAC AAA 

«8 AAA CCA ATT CCC CAA CAA AAA AAA CCA ATC CCC CAA CAC AAA AAA CCA ATC C^Tf ^ CAC 
r AAA ACA CTT CCA CTT ATA gAA AAA CAA CCA ATT CCC CAA CAC AAA ATT ^^^T^T^ 
388 ACA AAA ATT ACA AAC ACC ATC[Ta ACA AAT CCA ACA AAA CAT CAA ACA CAA JtT CTC L 
«8 CTT ATA AAT TCA ATC TTC CTC CCA CCC CCT ACT TTT CTA TTC CTA CAT ATA AAA CCT AAT 

r I" r 'h'' r V' V' r §^ g« r r f = r 

548 TCC CCC ACC CCC TAT CTT CAT ACA CAA TAT AAC AAA CCT AAA CCC OAT ATA CCA CCC CTA 
608 CAC ATA ACC CCT AAA CAT CTC CCA TCC ATA TCC CAT AAA CCA TCT CAA CCA TAT CAA CAA 
668 TAT CTA AAA ATT TCT CCA AAC AAC CTC AAC TTC ACA CCA AAA CAA TTA CAA GAA CTT CTA 
728 TCC TTC AAC ACA AAA CTA CTT ACT ATC CCA CCC CTA TCT AAA A^TA ^CC CCT CCT IaT AAC 

r" r V' r r J" r r r f ^ r r l'^ , 
r'' r r V' r ti'' I'' r r s*^ «A CCA Itt ATA ' 

908 ACA AAA ACA CCT TCC ATA TTA TAT CTT ACA TTC ACT CCT TTT ACC CCA TTA AAT CCC CAC 

^°^SALNCH2 
S68 TTC CAC AAT TTT ACT CCC AAA CAT CAA CCC CTT TTC TAT GTA ACA ACA CAA CCC ACT ACT 

.028 ACC CCC ACA ACT ATA ACA ACT CTC CTC AAT =CT CTC ACT TAT AAA AAT TTA ACA TTC TTT 
.088 ATA CAT CCA TAC AAC TTT CTT TCT TCA AAA ACA CAA CCT ATT ATC TAC CACCATTTTCCCCAC ^. 
. .52 «ACTTTCTTCACCCCCTACTAAAAAATCCTCACCArA:rr=rCTAAACATCCTCATACA>icCACCTCCAAACCTTCAC ^ 
CviJ I Digest
Cloned Anonymous Primer

CATTTTGCTGCCGGTCACTTAAGCGCTTCTAYYYYYYYYYYYYYYYYYTAGTAGGTTCGAACCGTGACCGGCAGCAAAATG

Mbo II    Fok I    Primer 1

Mbo II Digest (or Fok I)
Denature DNA
Anneal End Labeled Primer 1 (or Primer 2)

XXXXXXXXXXATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC
TGACCGGCAGCAAAATG-

DNA Polymerase
dNTPs

XXXXXXXXXXATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC
YYYYYYYYYYTAGTAGGTTCGAACCGTGACCGGCAGCAAAATG-

Denature and Separate Primer from Vector
Labeled Anonymous Primer Ready for Cosmid Sequencing